Input file Fbh8099FL; Output File Fbh8099FL.tra
Sequence length 2725 

CCACGCXSTCCGGCCTTCCGAAATAGAAACAAAGTT^ 
CTTCrTTTAGCATGCTATTATGGGGAAAGTC^ 

MVPVENTEGPSLLN 14 

GGCTCTAACTTCTACGTGACC ATG GTA CCT GTT GAA AAC ACC GAG GGC CCC AGT CTG CTG AAC 42 

QKGTAVETEGSGSRHPPWAR 34

CAG AAG GGG ACA GCC GTG GAG ACG GAG GGC AGC GGC AGC CGG CAT CCT CCC TGG GCG AGA 102

GCGMFTFLSSVTAAVSGLLV 54 

GGC TGC GGC ATG TTT ACC TTC CTG TCA TCT GTC ACT GCT GCT GTC AGT GGC CTC CTG GTG 162 

GYELGIISGALLQIKTLLAL 74

GGT TAT GAA CTT GGG ATC ATC TCT GGG GCT CTT CTT CAG ATC AAA ACC TTA TTA GCC CTG 222

SCHEQEMVVSSLVIGALLAS 94 

AGC TGC CAT GAG CAG GAA ATG GTT GTG AGC TCC CTC GTC ATT GGA GCC CTC CTT GCC TCA 282 

LTGGVLIDRYGRRTAIILSS 114

CTC ACC GGA GGG GTC CTG ATA GAC AGA TAT GGA AGA AGG ACA GCA ATC ATC TTG TCA TCC 342

CLLGLGSLVLILSLSYTVLI 134

TGC CTG CTT GGA CTC GGA AGC TTA GTC TTG ATC CTC AGT TTA TCC TAC ACG GTT CTT ATA 402

VGRIAIGVSISLSSIATCVY 154

GTG GGA CGC ATT GCC ATA GGG GTC TCC ATC TCC CTC TCT TCC ATT GCC ACT TGT GTT TAC 462

IAEIAPQHRRGLLVSLNELM 174

ATC GCA GAG ATT GCT CCT CAA CAC AGA AGA GGC CTT CTT GTG TCA CTG AAT GAG CTG ATG 522

IVIGILSAYISNYAFANVFH 194

ATT GTC ATC GGC ATT CTT TCT GCC TAT ATT TCA AAT TAC GCA TTT GCC AAT GTT TTC CAT 582

GWKYMFGLVIPLGVLQAIAM 214

GGC TGG AAG TAC ATG TTT GGT CTT GTG ATT CCC TTG GGA GTT TTG CAA GCA ATT GCA ATG 642 

YFLPPSPRFLVMKGQEGAAS 234

TAT TTT CTT CCT CCA AGC CCT CGG TTT CTG GTG ATG AAA GGA CAA GAG GGA GCT GCT AGC 702 

KVLGRLRALSDTTEELTVIK 254 

AAG GTT CTT GGA AGG TTA AGA GCA CTC TCA GAT ACA ACT GAG GAA CTC ACT GTG ATC AAA 762 

SSLKDEYQYSFWDLFRSKDN 274 

TCC TCC CTG AAA GAT GAA TAT CAG TAC AGT TTT TGG GAT CTG TTT CGT TCA AAA GAC AAC 822 

MRTRIMIGLTLVFFVQITGQ 294 

ATG CGG ACC CGA ATA ATG ATA GGA CTA ACA CTA GTA TTT TTT GTA CAA ATC ACT GGC CAA 882 

PNILFYASTVLKSVGFQSNE 314 

CCA AAC ATA TTG TTC TAT GCA TCA ACT GTT TTG AAG TCA GTT GGA TTT CAA AGC AAT GAG 942 

AASLASTGVGVVKVISTIPA 334 

GCA GCT AGC CTC GCC TCC ACT GGG GTT GGA GTC GTC AAG GTC ATT AGC ACC ATC CCT GCC 1002 
TLLVDHVGSKTFLCIGSSVM 354

ACT CTT CTT GTA GAC CAT GTC GGC AGC AAA ACA TTC CTC TG .TT GGC TCC TCT GTG ATG 1062

AASLVTMGIVNLNIHMNFTH 374 

GCA GCT TCG TTG GTG ACC ATG GGC ATC GTA AAT CTC AAC ATC CAC ATG AAC TTC ACC CAT 1122 

ICRSHNSINQSLDESVIYGP 394 

ATC TGC AGA AGC CAC AAT TCT ATC AAC CAG TCC TTG GAT GAG TCT GTG ATT TAT GGA CCA 1182 

GNLSTNNNTLRDHFKGISSH 414 

GGA AAC CTG TCA ACC AAC AAC AAT ACT CTC AGA GAC CAC TTC AAA GGG ATT TCT TCC CAT 1242 

SRSSLMPLRNDVDKRGETTS 434 

AGC AGA AGC TCA CTC ATG CCC CTG AGA AAT GAT GTG GAT AAG AGA GGG GAG ACG ACC TCA 1302

ASLLNAGLSHTEYQIVTDPG 454

GCA TCC TTG CTA AAT GCT GGA TTA AGC CAC ACT GAA TAC CAG ATA GTC ACA GAC CCT GGG 1362

DVPAFLKWLSLASLLVYVAA 474 

GAC GTC CCA GCT TTT TTG AAA TGG CTG TCC TTA GCC AGC TTG CTT GTT TAT GTT GCT GCT 1422 

FSIGLGPMPWLVLSEIFPGG 494

TTT TCA ATT GGT CTA GGA CCA ATG CCC TGG CTG GTG CTC AGC GAG ATC TTT CCT GGT GGG 1482

IRGRAMALTSSMNWGINLLI 514

ATC AGA GGA CGA GCC ATG GCT TTA ACT TCT AGC ATG AAC TGG GGC ATC AAT CTC CTC ATC 1542

SLTFLTVTDLIGLPWVCFIY 534

TCG CTG ACA TTT TTG ACT GTA ACT GAT CTT ATT GGC CTG CCA TGG GTG TGC TTT ATA TAT 1602

TIMSLASLLFVVMFIPETKG 554

ACA ATC ATG AGT CTA GCA TCC CTG CTT TTT GTT GTT ATG TTT ATA CCT GAG ACA AAG GGA 1662

CSLEQISMELAKVNYVKNNI 574

TGC TCT TTG GAA CAA ATA TCA ATG GAG CTA GCA AAA GTG AAC TAT GTG AAA AAC AAC ATT 1722

CFMSHHQEELVPKQPQKRKP 594

TGT TTT ATG AGT CAT CAC CAA GAA GAA TTA GTG CCA AAA CAG CCT CAA AAA AGA AAA CCC 1782

QEQLLECNKLCGRGQSRQLS 614 

CAG GAG CAG CTC TTG GAG TGT AAC AAG CTG TGT GGT AGG GGC CAA TCC AGG CAG CTT TCT 1842 

PET* 618 

CCA GAG ACC TAA 1854 

TGGCCTCAACACCTTCTGAACGTGGATAGTGCCAGAACACTTAGGAG 

CTGTGCTCTCTTTTCAGTGTCATGGAACTGGTTTT^ 

CTCCCCAGAAGGAACCTCAAAAGGTAGATGAGGTACAAGGTCCTAM 

AAAAAAAAAAGTTACTGGCTGGTTTAATACTTTCTACCTTC 

AGACATCAACCTCCGCCTTAAGCTATGTATGTATGGAGGCCAGTCGCAGCTTTAT^ 

AC^TGAGGGTACAGTTTCTGCCTACCAAGACACTACTTGCACTGGATCT^ 

GGACAACTGCCCATATATTCTATCTAGATTAGGAGAO 

ACAAGTATAAAGATTATAGAGCTTATTTTATGAACTATAAACTATAAT 

GTTAATATTGTGAAATATTAAAATAATTCCGCAATAAAAAAAAAA 
Protein Family / Domain Matches, HMMer version 2

Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine

HMMER is freely distributed under the GNU General Public License (GPL).



HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam

Sequence file: /prod/ddm/wspace/orfanal/oa-script.8089.seq



Query: 8099 

Scores for sequence family classification (score includes all domains):
Model        Description                      Score    E-value  N

sugar_tr     Sugar (and other) transporter    318.2    9.6e-92  1
FecCD_family FecCD transport family          -218.2        6.9  1
MCT          Monocarboxylate transporter     -235.8        2.7  1



Parsed for domains:

Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value

FecCD_family   1/1      26   227 ..     1   311 []  -218.2      6.9
sugar_tr       1/1      43   564 ..     1   488 []   318.2  9.6e-92
MCT            1/1      29   567 ..     1   611 []  -235.8      2.7



Alignments of top-scoring domains: 

FecCD_family: domain 1 of 1, from 26 to 227: score -218.2, E = 6.9

*->GalsispadvlqalfgggtegeievdeliiwdltlrRLPRvLlAlLV 
G+++ ++a ++ + + + ++ + +1LV 

8099 26 GSRHPPWARGCGMFT FLSS VTAA VSGLLV 54 

GAaLAVaGAi iQg 1 1 RJSIPLAs Pg i 1G in sGAs Igwl a i vl f pgg 1 s i s a 
G + lGi sGA 1 + ++1 + + ++ 

8099 55 G YELG 1 1 SGALLQ I KTLLALS CH - E QEMV 82 

lyl lp sf Af aGal iaal IVyl lawkgrngl spvrLiLaGial sal f sAl t 
+1+++A++ +1++++1+ +++ + i+ls+++ +1 

80 99 83 VS S L V I GALLAS LTGG VL I DR YGRR TAI ILSSCLLGLG 12 0 

tlllllsddlqdqqalfWltGSlsgrnWedvklalpilliglplalllar 
+1 1+ls + +++ + gr v + 1 + + + +a + 
8099 121 SLVLILSLSYTVL IVGRIAIGVSISLSSIATCVYIAEI 158 

qLnvLsLGddt AkgLGvnvervR .11111 1 w 1 L t G a a VA v AGp I g F VGL 
+++ R+11++1 +++ +G+ 
809 9 15 9 APQHRRgLLVSLNELMIV IGI 179 

ivPHiaRrLvGt . dhrwLLPaSALlGAiLLHADHARtlf aPiElPvGi 
+ +i h w ++++ + +P G+ 

8099 180 LS AY I SNYAFANvFHGW K YMFGL V I PLGV 208 



vTAl iGaPyFl YLLrr< - * 

+ A+ a+yFl + + + + + +L+- + -h 
8099 209 LQAI- -AMYFLppsprFLVMK 227 



sugar_tr: domain 1 of 1, from 43 to 564: score 318.2, E = 9.6e-92

* ->valvaalgGgf If GyDtgviggf lalidf If rfglltssgalaelvg 
+++aa+ G +1 Gy +g+i+g+l +i+ 1 s+ ++ 
8099 43 S S VT AAVS G - LLVG YE LG IIS GALLQ I K TLLALS CHEQE 80 
ystvltglwsifflGrliGslfaGklgdrfGRkksllialvlfviGall
+ws + + +G+1 + si +G l+dr+GR+ +++++++1 G+1 + 
8099 81 MWSSLVIGALLASLTGGVLIDRYGRRTAIILSSCLLGLGSLV 123 

sgaapgy tTiGl wa f yl 1 ivGRvl vGlgvGgasvl vPmYisE iAPkalRG 
+++ ++ +livGR+ +G + + 4- s ++ +Yi+EiAP + RG 

8099 124 LILSL SYTVLIVGRIAIGVS ISLSS IATCVYIAEIAPQHRRG 165 

algslyqlaitiGilvAaiiglglnktnndsalnswgWRiplglqlvpal 
l+sl++l+i+iGil A+1+++++++++ gW+ ++gl + +++ 

8099 166 LLVSLNELMIVIGILSAYISNYAFANVFH GW KYMFGLVT PLGV 208 

llligllfl PES PRwLvekgkl e eAre vLakl rgvedvdqe iqeikaele 
l++i++ flP SPR+Lv+kg++ A +vL + lr +d+++e+ ik + 1 + 
8099 209 LQAIAMYFLPPSPRFLVMKGQEGAASKVLGRLRALSDTTEELTVIKSSLK 258 

atvseekagkaswgel f rgrt rpkvrqr 1 lmgvmlqaf qQl tGiNa if YY 
+ + + + + + +lfr++ +r r+++g +1 +f Q tG i +Y 

8099 259 DEYQYS FWDL FRS KD - - NMRTR I M I GLT LVF F VQ I TGQ PN I L F Y 300 

sptifksvGvsdsvasllvtiivgwNfvf TfvaLif lvDrfGRR 

++t++ksvG+++ a 1 + + +vgw +++T++a + lvD+ G + + 
8099 301 AS TVLKS VGFQ SNE AAS LASTG VG WKV I S T I PA - TLL VDHVGS K t f 1 c i 349 



+ + + + + + ++++++ + + + 4-+ +- + 4- + + + + + + + + + 

8099 350 gssvmaaslvtmgivnlnihmnf thicrshnsinqsldesviygpgnlst 399 
pllllGaagmaicf lilgasi 

+ + + + + + + + + + ++4-+ + + + + + + + + + +3+ + + + + 

8099 400 nnntlrdhfkgisshsrsslmplrndvdkRGETTSASLLNAGLSHT 445 

gval 1 1 1 nkpkdp s skaag i va i vf i 1 1 f ia f Fa 1 gwGp ipwvi 1 sEl FP 

+ +4 p 4 + 4 44-44 +l+++a+F++g Gp+pW+ + 1 SE+ FP 

80 99 44 6 - - EYQ I VTDP - GDVPAFLKWLSLASLLVYVAAFS IGLGPMPWLVLSE I FP 492 

tkvRskalalataanwlanf iigf lfpyitgaiglalggyvf lvf agllv 
++R++a4al+ ++nw +n +i+++f ++t + ig ++v +44444 + 
8099 493 GGIRGRAMALTSSMNWGINLLISLTFLTVTDLIG LPWVCFIYTIMSL 53 9 

If ilfvff fvPETkGrtLEeieelf<-* 
++ Ifv +f +PETkG +LE+1+ + 
8099 540 ASLLFWMFI PETKGCSLEQ I SMEL 564 

MCT: domain 1 of 1, from 29 to 567: score -235.8, E = 2.7

*->kpPDGGwGWvWfasFlingfvdGf iks f Gvf f sellqeet 

pP W+ ++F + +v +++ + 4 G++ llq t 

8 099 2 9 HPP WARGCGMFTFLSSVTAAVSGl IvgyeLGI ISGALLQIKT 70 

IfnesksdvdtAwIgSimlavllf sGPls . SilvnrfGcRivmiaGglla 
1 s + + + ++S+ ++ 11+s 1 +++1 +r+G R+ +i4 ++1 
8099 71 LIiALSCHE - QEMWSSLVIGALLAS - - LTgGVLIDRYGRRTAI ILSSCLL 117 

gaGlllasFstniwelyltfGvitGlGfgf if qPaivilgqYF . eKrRsl 
g+G 1+ + s ++ +1++ + +G+ ++++ + + v +++ ++ rR+1 
8099 118 GLGSLVLILSLSYTVLIVGR- IAIGVSISLSSIATCVYIAEIApQHRRGL 166 

AtGiAvaGsGvGtwfppllqf lidny . .GsDWrgal. . .lilggillnc 
4+ +G+ + + + ++ 4 n+ +G W++ ++ + lg+++ + 

8099 167 LVS LNE LM I V I G I - L SAY I SN YAF ANV f hG - - WKYMFg 1 v I P LGVLQ A I - 212 

vicGalllRPlepsvpqdekdkeqetlkeakkkkendtettkeeteplks 
a++ P + + + + +++ + a++k+ + + te+l++ 
8099 213 AMYFLP PSPRFLVMKGQEGAASKVLGRLRALSDTTEELTV 252

IpkasilkledakaersvdsLlsskSvgerdksqlsekqksqasgrpsss 

8099 253 I 253 

atavqlvllrsrlekadlplkrvrvsrrrvlSkVSaeSgtdgersSgylN 
+ s+l 

8099 254 KSSL 257 

rkdvFYtGs i sNvae f kedpdkYrss s Ihgt r t tvgnae s qs 1 1 r ldds r 

8099 - - 

esgdgdsssedlsektrgdggkkessskeiretikkllDf svlknrtFll 

k ++ +++++ ++ + + + r ++ 
8099 258 KDEYQYS FWDLFR S KDNMRTR I M I G 2 82 

yaisnlf aslGf fvPlvf LvsYaikslgldekeAsf Llsi . iGvsnivGR 
+++ +++ G 1 + + + ks g ++eA+ L s+++Gv+ +++ 
8099 283 LTLVFFVQ I TGQPNI LFYAS TVL - KSVGFQSNEAASLASTgVGWKVI ST 331 

pifGlvADkkgvrpTarhivyifnlsllalGlttlacPlatsfwgLwyc 
+ + + 1 + D+ g ++ + +s++a+ 1 t+ + + ++ + c 

8099 332 I PATLLVDHVGS KT FLCIGSSVMAASLVTMGIVNLNIHMNFTHIC 376 

ilFGfs 

+++++ ++ + ++++ ++++++ + + + ++ +++++++ + +++ 
8099 377 RSHNSInqsldesviygpgnlstnnntlrdhfkgisshsrsslmplrndv 426 



++++++++ + + + ++ + + + ++ +++ + + + + + 
8099 427 dkrgettsasllnaglshteyqivtdpgdvpaf lkwlslasllvyvaaf s 476 

iGsygaLtfwLvdLvg 

+ ++ + ++ +++ +++ +++ + +i ++Ltf +++dL+g 

8099 4 77 iglgpmpwlvlseifpggirgramaltssmnwglNLLISLTFLTVTDLIG 526 

Wlekf snAfGllllfeGvavLvGPPiaGlLvDakttgdYtvaFyf sGill 
+ + ++++++ +a+L ++ + + t + + ++ + 
8099 527 LPWVCFIYTIMSLASLLFWMF IPE- -TKGCSLEQIS M 562 

llsgl<-* 
1+ + 

8099 563 ELAKV 567 



FIGURE 3C 



CLUSTAL W (1.74) multiple sequence alignment

MVPVENTEGPSLLNQKGTAVETEGSGSRHPPWARGCGMFTFLSSVTAAVSGLLVGYELr 

- - TRANS - - P - - RTERMPDAKKQG RSNKAMTFFVCFLAALAGLLFGLDIG 

* : * .::..:.:* *. : ** : **..***.* . . * 

I SGALLQ I KTLLALS CHEQEMWS S LVI GALLAS LTGGVL I DRYGRRTA 1 1 LS SC LLGL 

IAGALPFIADEFQITSHTQEWWSSMMFGAAVGAVGSGWLSFKLGRKKSLMIGAILFVA 
*:*** * : ** ****; :: ** .... * * . ** * 

SLVLILSLSYTVLIVGRIAIGVSISLSSIATCVYIAEIAPQHRRGLLVSLNELMIVIGI 

SLFSAAAPNVEVLILSRVLLGLAVGVASYTAPLYLSEIAPEKIRGSMISMYQLMITIGI 
**• = ■ ***:.*: :*:::.::* :: :* :: **** :: ** ..*. 

SAYI SNYAFANVFHGWKYMFGLVI PLGVLQAIAMYFLPPS PRFLVMKGQEGAAS KVLGR 

GAYLSDTAFS- YTGAWRWMLGVI IIPAILLLIGVFFLPDSPRWFAAKRRFVDAERVLLR 
.**.*. * * . .*::*:*:;* . : * * . : : * * * ***.. * . * . * 

RALSDTTE-ELTVIKSSLKDEYQYSFWDLFRSKDNMRTRIMIGLTLVFFVQITGQPNIL 
RDTS AEAKRELDE I RESLQVK - Q - SGWALFKENSNFRRAVFLGVLLQVMQQFTGMNVIM' 
* * : : ** *:.**::*****:.:.*:* :::*:*.:*;** * : 

YASTVLKSVGFQSNEAASLASTGVGVVKVISTIPATLLVDHVGSKTFLCIGSSVMAASL , 
YAPKI FEIAGYTNTTEQ^GWIVGLTNVLATFIAIGLVDRWGRKPTLTLGFLVMAAG- 



**..::: .*: . . . : . **:.:*::*; * *** . * * * . * * * 



* * 



TMGIVNLNIHMNFTHICRSHNSINQSLDESVIYGPGNLSTNNNTLRDHFKGISSHSRSSl 

- MGVLGTMMH I GI--HS 

**::.:*: ** ** 

MPLRNDVDKRGETTSASLLNAGLSHTEYQIVTDPGDVPAFLKWLSLASLLVYVAAFSIGI 

_p SA QYFAI AMLLMF I VGFAMSi 

* ** :::::***:::..*::. 

GPMPWLVLSEIFPGGIRGRAMALTSSMNWGINLLISLTFLTVTDLIGLPWVCFIYTIMSI 

GPLIWVLCSEIQPLKGRDFGITCSTATNWIANMIVGATFLTMLNTLGNANTFWVYAAIjl^ 
**: *:: **** *. . :: ::: ** * ::: * * * * . . . * ..*. . 

ASLLFVVMFIPETKGCSLEQISMEIAKVNWKNNICFMSH^ 

LFILLTLWLVPETKHVSLEHI ERNLMKGRKLR EI GAHD 

: : **** *** : * . ; * * . : - * 

ECNKLCGRGQSRQLS PET 
CLUSTAL W (1.74) multiple sequence alignment



^8099FL 
9830 | ARAE 



MVPVENTEGPSLLNQKGTAVETEGSGSRHPPWARGCG - MFTFLS S VTAAVSGLLVGYELG 
- -TRANS- -P- -RTERMVTINTESALrT- -PRSLRDTRRMNMFVS - VAAAVAGLLFGLDIG 
*: * .:: .:::**.: : * *. * *:* * : *** : *** * 



18099FL 
>83 0 | ARAE 



IISGALLQIKTLIiALSCHEQEMWSSLVIGALLASLTGGVLIDRYGRRTAIILSSCLLGL 
VIAGALPFITDHFVLTSRLQEWVVSSMML^^ 

;* . * * * ** : : • ♦ . * . * 



* . * * * * 



** ****...** 



18099FL 
830 | ARAE 



8099FL 
83 0 I ARAE 



GSLVLILSLSYTVLIVGRIAIGVSISLSSIATCVYIAEIAPQHRRGLLVSLNELMIVIGI 
GSIGSAFATSVEMLIAARWLGIAVGIASYTAPLYLSEMASENVRGKMISMYQL^^VTLGI 

: * :: :*::*:*.:: * * : : * : :**:.:** 



★ * * - . * . . 



L S A Y I SN Y AF ANVFHG - WKYMFG - L V I P LG VLQA I AMY FLPPSPRF L VMKGQEG AAS KVL 

VLAFLSDTAFS - -YSGNWRAMLGVLALP-AVLLIILVVFLPNSPRWLxAEKGRHIEAEEVL 
* * . . * * - * . * * . * * * * . * * * * * * . * * * . * . * * 



8099FL 
830 



GRLRALSDTT-EELTVIKSSLKDEYQYSFWDLFRSKDNMRTRIMIGLTLVFFVQITGQPN 
RMLRDTSEKAREEIJSTEIRESLKLK-Q-GGWALFKINRNVRRAVFLGMLLQAMQ 

* * * - - * * * * . * * * • * * * * - - * . * . . . * . * . * - * * 



8099PL 
830 |*RAE 



ILFYASTVLKSVGFQSNEAASLASTGVGWKVISTI^ 

IMYYAPRIFKMAGFTTTEQQMIATLVVGLTFMFATFIAVFTVDKAGRKPALKIGFSVMA^ 
: : * . ** :.* : * : * . : * . * * * * * * * 



8099a, 
330 | ARAE 



SLVTMGIVNLNIHMNFTHICRSHNSINQSLDESVIYGPGNLSTNNNTLRDHFKGISSHSR 

GTLVLG YC 

. * * 



3099pL S S LMPLRND VDKRGETTSAS LLNAGLSHTE YQ I VTD PGDVPAFLKWLSLAS LLVYVAAF S 

530 |AJIAE - -LMQFDN G-TASS GLS WLSVGMTMMCIAGYA 

~ * * . * * * . * . * * * * * * . . * 



i099PL 
S30|ARAE 



I GLGPMPWLVLS E I FPGGI RGRAMALTS SMNWG INLL I S LTFLTVTDL I GLPWVCF I YT I 
MSAAPWWI LCSE I QPLKCRDFG I TCSTTTNWVSNMI IGATFLTLLDS I GAAGTFWL YTA 
* . * . . * * * * * * * * . . * ***** *** 



J099FL 
)3Q I ARAE 



MSLASLLFVVMFIPETKGCSLEQISMELAKVNYVK^ 

LNIAFVGITFWLIPETKNVTLEHIERKIjMAGEKLRN- IGV 

.* - . . * * * * * . * * - * .* - .,** 



099FL 
30 | ARAE! 



QLLECNKLCGRGQSRQLS PET 
TRANS PRTERMVTINTESALT- - PRSLRDTRRMNMFVSVAAAVAGLLFGLDIGA

TRANSPRTERMPDAKKQG RSNKAMTFFVCFLAALAGLLFGLDl^ 

MVPVENTEGPSLUJQKGTAVETEGSGSRHPPWARGCGMFTFLSSVTAAVSGLLVGYELG] 
*..* .:: * . :.::..**:;***. *..*. 

IAGALPFITDHFVLTSRLQEWVVSSMMLGAAIGALFNGWLSFRLGRKYSLMAGAILFVLC 
IAGALPFIADEFQITSHTQEWWSSMMFGAAVGAVGSGWLSFKLGRKKSLMIGAILFVAC 
ISGALLQIKTLLALSCHEQEMWSSLVIGALLASLTGGVLIDRYGRRTAIILSSCLLGLC 
• • • • • • • • «ZI .1 * : * 

SIGSAFATSVEMLIAARWLGIAVGIASYTAPLYLSEMASEWRGKMISMYQLMVTLGIV 
SLFSAAAPNVEVL. I LSRVLLGLAVGVAS YTAPLYLSE I APEKI RGSMI SMYQLM I T IG I L 

SLVLILSLSYTVLIVGRIAIGVSISLSSIATCVYIAEIAPQHRRGLLVSLNELMI VIGIL 
* : : . ;** . * : :*:::.::* :: * * i i * • • * * . 

LAFLSDTAFS - - YSGNWRAMLGVLALP - AVLLI ILWFLPNSPRWLAEKGRHI EAEEVLR 
GAYLSDTAFS - - YTGAWRWMLGVI IIP-AI LLLIGVFFLPDSPRWFAAKRRFVDAERVLL 
SAYI SNYAFANVFHG - WKYMFG - LVI PLGVLQAI AMYFLPPS PRFLVMKGQEGAAS KVLG 

... . . . . . » .* - ... . . . 

MLRDTS EKAREELNE I RES LKLK- Q - GGWALFKINRNVRRAVFLGMLLQAMQQFTGMNII 
RLRDTS AEAKRELDE I RESLQVK -Q- SGWALFKENSNFRRAVFLGVLLQVMQQFTGMNVI 
RLRALSDTT-EELTVIKSSLKDEYQYSFVTOLFRSKDNMRTRIMIGLTLVFFVQITGQPNI 
* * * . .** * z . * * ; : * . * * * : i * . * * . ★ 

MYYAPRIFKMAGFTTTEQQMIATLVVGLTFM 
MYYAPKIFEIiAGYTNTTEQMWGTVIVGLTNVLATFIAI 

LFYASTVXjKS VGFQSNEAASLASTGVGVVKVI ST I PATLLVDHVGS KTFLC I GS S VMAAS 
::**. ::: .*: .: ** : **.***.* 

TLVLGYCLMQFDN GTASSG- - 

MGVLG-TMMHI GIHSPS - - 

LVTMGIVNLNIHMNFTHICRSHNSINQSLDESVIYGPGNLSTNNNTIiRDHFKGISSHSRS 
.:*::: * * . 

LS WLS VGMTMMC I AGYAM 

AQYFAIAMLLMFIVGFAM 

SLMPLRISTDVDKRGETTSASLLN^ 



SAAPWWI LCSE IQP - - LKCRDFG I TC S TTTNWVSNM 1 1 GATFLTLLDS IGAAGTFWLYT 
SAGPLI WVLCSEIQP - -LKGRDFGITCSTATNWIANMI VGATFLTMLNTLGNANTFWVYA 
GLGPMPWL VLS E I F PGG I RGRAMALTS S - - MNWG I NLL I S LTFLTVTDL I GL P WVC F I YT 
.*: * : : *** * :: * * * * : : : . * * * * : : : * ; : * : 

AI^IAFVGITFWLIPETKNWLEHIERKLMAGEKIiRN IGV 

ALNVLF I LLTLiWLiVPETKHVSLEH I ERNLMKGRKLRE IGAHD 

IMSLASLLFWMFI PETKGCSLEQ I SMELAKVNYVKNNI CFMSHHQEELVPKQPQKRKPQ 



EQLLECNKLCGRGQSRQLS PET 



CLUSTAL W (1.74) multiple sequence alignment



'bh8099FL 
'02168Patent 



bh8099FL 
02168Patent 



bh8099FL 
02168Patent 



bh8099FL 
02168Patent 



bh8S^9FL 
02liSPatent 



bh80B9FL 
0216§Patent 



bh8&9FL 
02 lSjj Patent 



bh8^9FL 
02 16B Patent 



bh8099FL 
02168Patent 



•bh8099FL 
'02168Patent 



bh8099FL 
02168Patent 



MVPVENTEGPSLLNQKGTAVETEGSG SRHPPWARG - CGMFTFLS SVTAAVSGLLVGY 

MTSDHEHMTAVCASHVQTHGSQLQIQKLSPCFRPPTPAFRISSSIILLGAG-LAGP 

.*.*** . * * * « ** . . * * * 

. » . . • ■ • « • • . » 

ELGI ISGALLQIKTLLALSCHEQEM^ 

STGDRWFGVSWGTGLFLPPLQLLLPPRLLFTHAILERLHLWLALPPVLVLGHALLH-CK 

* * * * - * * . * * - . * * 

• . . ... .. 

LGLGSLVLILSLSYTVLIVGRIAIGVS1SLSSIATCVYIAEIAPQH- -RRGLLVSLNELM 

VGGSTARAGDQLVQRVLLL- IVFLHRWVQVWPZGTEVDILGMGSRTGGRRGPELRP G 

. * . * * * . . .. • - * * * . . * * * 

m m 0 • «... . • . . • .... . 

IVIGILSAYISNYAFANVFHGWKYMFGLVIPLGVLQAIAMYFLPPSPRFLVMKGQEGAAS 

FR I S I LS AYI S2STYAFANVFHGWKYMFGLVI PLGVLQAI AMYFLPPS PRFLVMKGQEGAAS 
. * ******************************************************** 

KVLGRLRALSDTTEELTVIKSSLKIDEYQYSFWDLFRSKDNMRTRIMIGLTLVFFVQITGQ 
KVLGRLRALSDTTEELTVI KS S LKDEYQYS FWDLFRS KDNMRTR I M I GLTLVFFVQ I TGQ 
************************************************************ 

PNI LF YAS TVLKS VGFQSNEAAS LAS TGVGWKVI ST I PATLLVDHVGS KTFLC I GS S VM 

PNILFYASTVLKSVGFQSNEAASLASTGVGVVKVISTIPATLLVDHVGSKTFLCIG 

******************************************************** 

AASLVTMGIVNLNIHMNFTHICRSHNSINQ 



SRSSLMPLRNDVDKRGETTSASLLNAGLSHTEYQIVTDPGDVPAFLKWLSLASLLVYVAA 

LLNAGLSHTEYQIVTDPGDVPAFLKWLSLASLLVYVAA 

************************************** 

F S I GLGPMPWLVLS E I FPGGI RGRAMALTS SMNWGINLL I S LTFLTVTDLIGLPWVCFIY 
FSIGLGPMPWLVLSEIFPGGIRGRAMALTSSMNWGINLLISLTFLTVN-LIGLPWVCFIY 
*********************************************** . *********** 

TIMSLASLLFVVMFIPETKGCSLEQISMELAKV^ 
TIMSLASLLFVVMFIPETKGCSLEQISMEL^ 

************************************************************ 

QEQLLECNKLCGRGQSRQLSPET 
QEQLLECNKLCGRGQSRQLSPET 
*********************** 
Input file Fbh46455FL.seq; Output File Fbh46455FL.tra

Sequence length 2230 

GTCGACCCACGCGTCCGGCAACATG GCG^ 
TACCCCGGGTGAGGGGTGGCCTCCGCGTGGGATCGTGCC^ 

GCACGTCCCCTCCGCGCTGTGTGTCTACTGAGACGGGGAGGCGTGACAGGGCCCGGGTCCCTTCTCAGTGGTGCTCTGT 
GCTTCAGGGCAAGCTCCCCGTCTCCGGGCGCACTTCCCTCGCCTGTGTTCGGTCCATCC 

M A G S D 5 

CCTCGCAGGTGGGATCGTCGGTGGGACCGGAGCGCGGGCGGGCGCGGCCCCCCGGGACC ATG GCC GGG TCC GAC 15 

TAPFLSQADDPDDGPVPGTP 25

ACC GCG CCC TTC CTC AGC CAG GCG GAT GAC CCG GAC GAC GGG CCA GTG CCT GGC ACC CCG 75

GLPGSTGNPKSEEPEVPDQE 45

GGG TTG CCA GGG TCC ACG GGG AAC CCG AAG TCC GAG GAG CCC GAG GTC CCG GAC CAG GAG 135

GLQRITGLSPGRSALIVAVL 65

GSp CTG CAG CGC ATC ACC GGC CTG TCT CCC GGC CGT TCG GCT CTC ATA GTG GCG GTG CTG 195 

CYINLLNYMDRFTVAGVLPD 85

ife tac atc aat ctc ctg aac tac atg gac cgc ttc acc gtg gct ggc gtc ctt ccc gac 255 

IEQFFNIGDSSSGLIQTVFI 105

atc gag cag ttc ttc aac atc ggg gac agt agc tct ggg ctc atc cag acc gtg ttc atc 315

SSYMVLAPVFGYLGDRYNRK 125

TCC AGT TAC ATG GTG TTG GCA CCT GTG TTT GGC TAC CTG GGT GAC AGG TAC AAT CGG AAG 375 

YLMCGGIAFWSLVTLGSSFI 145

TMr CTC ATG TGC GGG GGC ATT GCC TTC TGG TCC CTG GTG ACA CTG GGG TCA TCC TTC ATC 435 

PGEHFWLLLLTRGLVGVGEA 165

CCC GGA GAG CAT TTC TGG CTG CTC CTC CTG ACC CGG GGC CTG GTG GGG GTC GGG GAG GCC 495 

SYSTIAPTLIADLFVADQRS 185

AGT TAT TCC ACC ATC GCG CCC ACT CTC ATT GCC GAC CTC TTT GTG GCC GAC CAG CGG AGC 555 

RMLSIFYFAIPVGSGLGYIA 205

CGG ATG CTC AGC ATC TTC TAC TTT GCC ATT CCG GTG GGC AGT GGT CTG GGC TAC ATT GCA 615 

GSKVKDMAGDWHWALRVTPG 225 

GGC TCC AAA GTG AAG GAT ATG GCT GGA GAC TGG CAC TGG GCT CTG AGG GTG ACA CCG GGT 675

LGVVAVLLLFLVVREPPRGA 245

CTA GGA GTG GTG GCC GTT CTG CTG CTG TTC CTG GTA GTG CGG GAG CCG CCA AGG GGA GCC 735

VERHSDLPPLNPTSWWADLR 265

GTG GAG CGC CAC TCA GAT TTG CCA CCC CTG AAC CCC ACC TCG TGG TGG GCA GAT CTG AGG 795 

ALARNPSFVLSSLGFTAVAF 285

GCT CTG GCA AGA AAT CCT AGT TTC GTC CTG TCT TCC CTG GGC TTC ACT GCT GTG GCC TTT 855 

VTGSLALWAPAFLLRSRVVL 305 

GTC ACG GGC TCC CTG GCT CTG TGG GCT CCG GCA TTC CTG CTG CGT TCC CGC GTG GTC CTT 915 
GETPPCLPGDSCSSSDSLIF 325

GGG GAG ACC CCA CCC TGC CTT CCC GGA GAC TCC TGC TCT TCC TCT GAC AGT CTC ATC TTT 975 

GLITCLTGVLGVGLGVEISR 345

GGA CTC ATC ACC TGC CTG ACC GGA GTC CTG GGT GTG GGC CTG GGT GTG GAG ATC AGC CGC 1035 

RLRHSNPRADPLVCATGLLG 365 

CGG CTC CGC CAC TCC AAC CCC CGG GCT GAT CCC CTG GTC TGT GCC ACT GGC CTC CTG GGC 1095 

SAPFLFLSLACARGSIVATY 385

TCT GCA CCC TTC CTC TTC CTG TCC CTT GCC TGC GCC CGT GGT AGC ATC GTG GCC ACT TAT 1155 

IFIFIGETLLSMNWAIVADI 405

ATT TTC ATC TTC ATT GGA GAG ACC CTC CTG TCC ATG AAC TGG GCC ATC GTG GCC GAC ATT 1215 

LLYVVIPTRRSTAEAFQIVL 425

CTG CTG TAC GTG GTG ATC CCT ACC CGA CGC TCC ACC GCC GAG GCC TTC CAG ATC GTG CTG 1275

SHLLGDAGSPYLIGLISDRL 445

TCC CAC CTG CTG GGT GAT GCT GGG AGC CCC TAC CTC ATT GGC CTG ATC TCT GAC CGC CTG 1335

RRNWPPSFLSEFRALQFSLM 465

cfff CGG AAC TGG CCC CCC TCC TTC TTG TCC GAG TTC CGG GCT CTG CAG TTC TCG CTC ATG 1395 

LCAFVGALGGAAFLGTAIFI 485

dlfc TGC GCG TTT GTT GGG GCA CTG GGC GGC GCA GCC TTC CTG GGC ACC GCC ATC TTC ATT 1455 

EADRRRAQLHVQGLLHEAGS 505

GAG GCC GAC CGC CGG CGG GCA CAG CTG CAC GTG CAG GGC CTG CTG CAC GAA GCA GGG TCC 1515

TDDRIVVPQRGRSTRVPVAS 525

$k GAC GAC CGG ATT GTG GTG CCC CAG CGG GGC CGC TCC ACC CGC GTG CCC GTG GCC AGT 1575 



VLI* 529
GTG CTC ATC TGA 1587

(^GGCTGCCGCTCACCTACCTGCACATCTGCCACAGCTGGCCCTGGGCCCACCCCACGAAGGGCCTGGGCCTAACCCCT
TGGCCTGGCCCAGCTTCCAGAGGGACCCTGGGCCGTGTGCCAGCTCCCAGACACTACATGGGTAGCTCAGGGGAGGAGG
TGGGGGTCCAGGAGGGGGATCCCTCTCCACAGGGGCAGCCCCAAGGGCTCGGTGCTATTl'GTAACGGAATAAAATTTGT

AGCCAGAAAAAAAAAAAAAAAGGGCGGCCGC

Protein Family / Domain Matches, HMMer version 2



Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL).

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam

Sequence file: /prod/ddm/wspace/orfanal/oa-script.9015.seq

Query: 46455

Scores for sequence family classification (score includes all domains):
Model           Description                           Score    E-value  N

sugar_tr        Sugar (and other) transporter         -63.4    0.00016  1
Na_Galacto_symp Sodium:galactoside symporter family  -121.2       0.17  1
MCT             Monocarboxylate transporter          -208.2       0.32  1

Parsed for domains:

Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value

MCT               1/1      60   473 ..     1   611 []  -208.2     0.32
sugar_tr          1/1      58   487 ..     1   488 []   -63.4  0.00016
Na_Galacto_symp   1/1     212   505 ..     1   285 []  -121.2     0.17

Alignments of top-scoring domains: 

MCT: domain 1 of 1, from 60 to 473: score -208.2, E = 0.32 

* - >kpPDGGwGWvWf asFl ingf vdGf iksf Gvf f sel lqeet 1 f nesk 
+ +V in + ++++ ++ ++ e++fn+++ 
46455 60 LIVAVLCYINLLNYMDRFTVAGVLPDI EQFFNIGD 94 

sdvdtAwIgS imlavl 1 f sGPlsS i Ivnr f GcRivmiaGgl lagaGl 11a 
s ++ i + + + + + ++P + + 1 + r+ ++ m+ G + + ++ 1 + 
46455 95 S S - - SGL I QT VFI S S YMVLAP VFG YLGDR YNRKYLMCGG I AFW S LVTLGS 142 

sFst. .niwelyltfGvitGlGfgfifqPai.vilgqYFe KrRsl 

sF + + + + w+l It G ++G+G + + + + ++ + + F +++++ +s+ 
46455 143 SFI PgeHFWLLLLTRG- LVGVGEA- SYST IApTLlADLFVadqrsRMLS I 190 

At G i AvaG sGvGtwfpp 1 1 q f 1 i dnyGs DW rga llilggillncvicGa 
+ GsG+G +++ ++ d G DW +al+++ g+ + v++ + 

46455 191 F YFAI PVGSGLGY I AGS KVKDMAG - DWHWALRVTPGLG WAVLLLF 23 5 

lllRPlepsvpqdekdkeqetlkeakkkkendtettkeeteplkslpkas 
1 +R+ +p++ ++ 
46455 236 LWRE PPRGAVER 248 



46455 



46455 



46455 



ilkledakaersvdsLlsskSvgerdksqlsekqksqasgrpsssatavq 



IvllrsrlekadlplkrvrvsrrrvlSkVSaeSgtdgersSgylNrkdvF 
+ s + l 

249 HSDL 252 

YtGsisNvaefkedpdkYrssslhgtrttvgnaesqstlrlddsresgdg 
46455



dsssedlsektrgdggkkessskeiretikkllDf svlk.nrtFll . . . . 

++++ + + D++ 1 +n++F+l++ + 
253 PPLNPTS WWADLRALArNPS FVLs s lg 279 



.yaisnlfaslGf fvPlvf LvsYaikslg IdekeAsf Lis 

a++++ sl++ +P++ L s++ lg++++ +++ + ++++ s+ + 
46455 280 f TAVAFVTGS LALWAPAFLLRSRV - - VLGe t pp c 1 pgds CS S SD - S L I FG 326 

iiGvsnivGRpifGlvADkkgvrpTarhivyifnlsllal . .GlttlacP 
i + v +++G+ + +++ r + ++ 11 ++1++ + + 
46455 327 LITCLTGVLGVGLGVEISRRLRHSNPRADPLVCATGLLGSapFLFLSLAC 376 

latsf wgLwycilFGf s . iGsygaLtfwLvdLvgWlek f snA 

+s++ ++ + i + G + +++a+ + +L v + +++ + + f + 
46455 377 ARGSIVATYIF- IFIGET1LSMNWAIVADILLYWI - PTRrstaeaFQIV 424 

fGllllfeGvavLvGPPi . . . . aGlLvDakttgdYtvaFyf sGillllsg 
++11 G + L+G ++++ + ++++ + ++ ++ ++++ 

46455 425 LSHLLGDAGSPYLIGL ISdr 1 rRNWPPSF - - LSEFRALQFSLMLCAFVGA 472 

!<-* 
1 

46455 473 L 473 

sugar_tr: domain 1 of 1, from 58 to 487: score -63.4, E = 0.00016 

*- >valvaalgGgf If GyDtgviggf lalidf If rfglltssgalaelvg 
al++a+ + + + ++++ ++ + f+ + +s+ 
4 6455 58 SAL I VAVL C - Y I NLL.N YMDR FT VAGVL P D I E Q F FN I GDS S 96 

ystvltglwsif f lGrliGslf aGklgdrf GRkksllialvlfviGall 
+gl+ ++f+ + + + ++G+lgdr+ Rk+ + + + + + + +1+ 
46455 97 S GL I QT VF I S S YMVLAP VFG YLGDR YNR K YL MCGG I AF WS L VTLG 141 

sgaapgytTiGlwafyllivGRvlvGlgvGgasvlvPmYisEiAPkalRG 
s ++pg +f+ll++ R IvG g s ++P++i+ + R 

46455 142 SSFIPGE HFWLLLLTRGLVGVGEASYSTIAPTLIADLFVADQRS 185 

algslyqlaitiGilvAaiiglglnktnndsalnswgWRiplglqlvpal 
+ + s++ +ai +G +++i g + + +++d +w R+ gl+ v 1 
46455 186 RMLS I F YFAI PVGSGLGYIAGS KVKDMAGD W HWALR VTPGLG WAVL 232 

llligllf lPESPRwLvekgkleeArevLaklrgvedvdqeiqeikaele 
11++++ P rg + + ++ + +++ 

46455 233 LLFLWREPP RGAVERHSDLPPLNPTSW 260 

atvseekagkaswgelf rgrtrpkvrqrllmgvmlqaf qQltGiNaif YY 
+ + + 1 r+++ +1 + + +a+ +tG ++ + 

46455 261 WA DLRALARNPS FVLSSLGFTAVAFVTG- -SLALW 293 

sptif ks vGvsds va s 1 lvt i i vgwNf vf Tf vaLif lvD 

+p ++ ++ +++++ +ds +s ++i+g+++ ++ + + + 1 
46455 294 APAFLLRsrwlge tppCLPGDS - CSS SDSLI FGL I TCLTGVLG- VGLGV 341 

rfGRR pllllGaagmaicf lilgasigvallllnkpkdpss 

+ RR ++++++ +pl++ ++ ++ fl+l+ 1++ ++ 
46455 342 E I SRR1 rhsnpradPLVCATGLLGSAPFLFLS LACARGS 380 

kaagivaivf illf iaf FalgwGpipwvilsElFPtkv . . . .Rskalala 
iv++++fi+ + + + w+i++++ v +++Rs+a a+ 
46455 381 IVATYIFIFIGE-TLLSMNWAIVADILLYWiptrRSTAEAFQ 422 

taanwlanf iigf lfpyitgaigl . . . . alggyvf Ivf agl . . .lvlfil 
++ 1 + + + py+ g i+ + ++++++++ f +1+ +1 1+ + 
46455 423 IVLSHLLGDAGS PYLIGLISDrlrrNWPPSFLSEFRALqf sLMLCAF 469 
fvf f fvPETkGrtLEeieelf <-*
++ ++ ++ G +i 
46455 470 VGALGGAAFLG TA I F I EA 487 



Na_Galacto_symp: domain 1 of 1, from 212 to 505: score -121.2, E = 0.17
*->qlG.yf f falV. . . LslagwllwiCf . . . . fgtkEvySssdtreng 
+ G++++++ V+++L++++V+11+++ +++++g E+ sd ++ + 
46455 212 MAGdWHWALRVtpgLGWAVLLLFLWreppRGAVERH- -SDLPPLN 256 

qkttsllqslkllakNdQ . . LliLclaalf yllainilgg . aqlYYvtYv 
++ ++l++la+N++ L L++ a + ++ +1 ++a 1 + v 

46455 257 PTSW- - WADLRALARNPS f vLSSLGFTAVAFVTGSLALWApAFLLRSRVV 304 

LG.dpelFs ylllynilvgligslLf PrLvkrf . .gkktv 

LG++p +++++ ++++ + +++1+ +1 g g+ L + + r+ + + +- + + 
46455 305 LGeTPPCLPgdscsssdsl IFGLITCLTGVLGVGLGVEISRRLrhSNPRA 354 

FagcivlmvlgslliFf vagsslal . ilvlif lagilqqlvtllvWvlQV 
+ ++lg ++ F++ +++ a + +v+ + + + + + + +W++ 
46455 355 DPLVCATGLLG-SAPFLFLSLACARgSIVATYIFIFIGETLLSMNWAI - - 401 

IMvsDt VDYGEwktGvRl EG1 vyS vf 1 f vl K1G1 Al sGa 1 vGwi L . . gyi 
v+D+ Y t +R+ + + ++ 1 1 1G A s 1+G + i ++ 

46455 402 - - VADILLYWT PT - RRSTAEAFQ IVLSHL - LGDAGS PYLIGLISdrLRR 447 

GYvanasqststalgQlvf ilalFalPpallllaaf imlrf YkLtekkla 
+ ++ s al+ f 1 1 a++ al +a + ++ f+ + + + 
46455 448 NWPPSFL-SEFRALQ FSLMLCAFVGALGGAAFLGTAI FIEADRRRAQ 493 

eIveeLekWrtrkrk<-* 

V L+ + 4- 

46455 494 LHVQGLL HEAGS 505 
CLUSTAL W (1.74) multiple sequence alignment



Fbh464 55FL 
292825 



MAGSDTAPFLSQADDPDDGPVPGTPGLPGSTGNPKSEEPEVPDQEGLQRITGLSP G- 

MVRNKVAPVEDGANIQRNFEPP- -P- -PYTT- -P-TDSPEDKIRSNSTATTASQPEFQGC 
* ** *• ***.** :;.** ;.. *. .* * 



Fbh46455FL 
292825 



?bh46455FL 
5928r2 5 



R^^LIVAVLCYINLLNYNpRFTVAGVLPDIEQFFNIGDSSSpLIQTVFISSYMVLAPVFG 
WTI VWAI LFI INLLNYMDRYTI AGVLNDVQTYYNI SDAWAGLIQTTFMVFF 1 1 FS PI CG 
..**.* ********* : * : **** ::**.*: : * * * * * ^ * ; .....*. * 

YlJGDRYNRK^LMCGGIAFWSLVTLGSSFIjPGEHFWLLLLTRGLVGVGE^SYSTIAPTLIA 
FLGDRYNRKWI FWGI AI WVSAVFASTFI PSNQFWLFLLFRGIVGIGEASYAI I S PTVIA 
.********- : : ***.* ..:.*:***.::***:** ** : ** : ***** : *.★*-** 



?bh4«4 55FL 
592811 5 



?"bh%g455FL 
S92825 



? bh4£455FL 
!;92i25 



DLFVApQRSRI^LS I FY FA I P VG S G LG Y I Pjp S KVKDMAGD WI^WALRVT PGLG WAVL L L F L 
DMFTGVLRSRMLMVFYFAI PFGCGLGFWGSAVASWTGHWQWGVRVTGVLGI VCLLLI IV 
* . * ***** .****** # *^*** :: .** * . - * ^ * : * ^ ; * * * 

WtREPPRGAVERHS - DLPPLNP - TSVTOADLRALARNPS^LSSLGFTAVAFVTGSLALW^ 
FWEPERGKAEREKGEIAASTEATSYLDDMKDLLSNATYVTSSLGYTATVFMVGTLAWW^ 
* * * * ** ** - . **- *:: * * . : : * * * 

PAFLLRSRVVLGETPPCLPGDSCSSSDSpjIFGLITCLTGVLiGVGLGVfEIS RR L 

PITIQYADSAR - RNGTITE -DQKANIN- LVFGALTCVGGVLGVAIGTLVSNMWSRGVGPF 

. ^ ^ .*.**.**.***★*-* -* * ; 



^bh46455FL 
592825 



bh46455FL 
592825 



'bh46455FL 
.92825 



'bh46455FL 
192825 



RHSNP - RADflLVCATGLLGSAPFLFLSLAGP^GS I VAT^/I FI FIGETLLSMNWAI VADIL, 
KHI QTVRADALiVCAI GAA I C I PTL I LA I QN I E SNMNFAWGMLF I C I VAS S FNWATNVDLL 
. * . ******** .**:*:: ::::** 



* . * * * * . * 



LYVWrPTRRSTAEAFQ{rVLSHLLGDAGSPYLIGLIlSDRLRRNWPPSFLSEFRALQ(FSLML 
LS VWPQRRSSASS WQ I L I SHMFGDASGP Y I LGL I S DAI RGNED - TAQAHYKS LVTS FWL 
* **.* ***.* ..**..**.;***..**:;***** :* * : 



... * * . * 



CAFVGALGGAAFLGTAI F LBADRRRAQLHVQGLLHEAGSTDDRIVVPQRGRSTRVPVASV 
CVGTLVLS V ILFGISAI TWKD KARFNE I MLAQANKDNTS SG - - TLP I EDRNTEDETGS E 
** * ; ** : *.* : : .:*..*.*. ..* 

LI-- 
VQHM 
Input file 54414; Output File 54414.tra
Sequence length 4632

CACGCGTCCGCCCACGCGTCCGCCCACGCGTCCGAG^ 
CCCCTCCCTGTCCCCTTCCCCTTCTCCCATCC^ 

M V D 3 

TATTTTCTTTCTTTCTCCCTCCTCT ATG GTT GAT 9 

LESEVPPLPPRYRFRDLLLG 23 

TTG GAG AGC GAA GTG CCC CCT CTG CCT CCC AGG TAC AGG TTT CGA GAT TTG CTG CTA GGG 69 

DQGWQNDDRVQVEFYMNENT 43 

GAC CAA GGA TGG CAA AAC GAC GAC AGG GTA CAA GTT GAA TTC TAT ATG AAT GAA AAT ACA 129

FKERLKLFFIKNQRSSLRIR 63

TTT AAA GAA AGA CTA AAA TTA TTT TTC ATA AAA AAC CAG AGA TCA AGT CTA AGG ATA CGC 189

Q^FNFSLKLLSCLLYIIRVLL 83 

&TC TTC AAT TTT TCT CTC AAA TTA CTA AGC TGC TTA TTA TAC ATA ATC CGA GTA CTA CTA 24 9 

jpENPSQGNEWSHIFWVNRSLP 103 

gAA AAC CCT TCA CAA GGA AAT GAA TGG TCT CAT ATC TTT TGG GTG AAC AGA AGT CTA CCT 309 

i;HLwGLQVSVAL I SLF ET I LLG 123 

*HTG TGG GGC TTA CAG GTT TCA GTG GCA TTG ATA AGT CTG TTT GAA ACA ATA TTA CTT GGT 369 

MY LSYKGNIWEQILRIPFILE 143 

CTT AGT TAT AAG GGA AAC ATC TGG GAA CAG ATT TTA CGA ATA CCC TTC ATC TTG GAA 429 

%.a I NAVPF I I S I FWP SLRNLF 163 

^TA ATT AAT GCA GTT CCC TTC ATT ATC TCA ATA TTC TGG CCT TCC TTA AGG AAT CTA TTT 489 

VPVFLNCWLAKHALDNMIND 183

GTC CCA GTC TTT CTG AAC TGT TGG CTT GCC AAA CAT GCC TTG GAT AAT ATG ATT AAT GAT 549 

LHRAIQRTQSAMFNQVLILI 203 

CTA CAC AGA GCC ATT CAG CGT ACA CAG TCT GCA ATG TTT AAT CAA GTT TTG ATT TTA ATA 609 

STLLCLIFTCICGIQHLERI 223 

TCT ACA TTA CTA TGC CTT ATC TTC ACC TGC ATT TGT GGG ATC CAA CAT CTG GAA CGA ATA 669 

GKRLNLFDSLYFCIVTFSTV 243

GGA AAG AGG CTG AAT CTC TTT GAC TCC CTT TAT TTC TGC ATT GTG ACG TTT TCT ACT GTG 729

GFGDVTPETWSSKLFVVAMI 263

GGC TTC GGG GAT GTC ACT CCT GAA ACA TGG TCC TCC AAG CTT TTT GTA GTT GCT ATG ATT 789 

CVALVVLPIQFEQLAYLWME 283 

TGT GTT GCT CTT GTG GTT CTA CCC ATA CAG TTT GAA CAG CTG GCT TAT TTG TGG ATG GAG 849 

RQKSGGNYS RHRAQTEKHVV 303 

AGA CAA AAG TCA GGA GGA AAC TAT AGT CGA CAT AGA GCT CAA ACT GAA AAG CAT GTC GTC 909 

LCVSSLK I DLLMDFLNEFYA 323 

CTG TGT GTC AGC TCA CTG AAG ATT GAT TTA CTT ATG GAT TTT TTA AAT GAA TTC TAT GCT 969 
HPRLQDYYVVILCPTEMDVQ 343

CAT CCT AGG CTC CAG GAT TAT TAT GTG GTG ATT TTG TGT CCT ACT GAA ATG GAT GTA CAG 1029

VRRVLQIPMWSQRVIYLQGS 363

GTT CGA AGG GTA CTG CAG ATT CCA ATG TGG TCC CAA CGA GTT ATC TAC CTT CAA GGT TCA 1089

ALKDQDLLRAKMDDAEACFI 383

GCC CTT AAA GAT CAA GAC CTA TTG AGA GCA AAG ATG GAT GAC GCT GAG GCC TGT TTT ATT 1149

LSSRCEVDRTSSDHQTILRA 403

CTC AGT AGC CGT TGT GAA GTG GAT AGG ACA TCA TCT GAT CAC CAA ACA ATT TTG AGA GCA 1209

WAVKDFAPNCPLYVQILKPE 423

TGG GCT GTG AAA GAT TTT GCT CCA AAT TGT CCT TTG TAT GTC CAG ATA TTA AAG CCT GAA 1269

NKFHIKFADHVVCEEEFKYA 443

HiPJV AAA TTT CAC ATC AAA TTT GCT GAT CAT GTT GTT TGT GAA GAA GAG TTT AAA TAC GCC 1329

MLALNCICPATSTLITLLVH 463

ATG TTA GCT TTA AAC TGT ATA TGC CCA GCA ACA TCT ACA CTT ATT ACA CTA CTG GTT CAT 1389

TSRGQEGQQSPEQWQKMYGR 483

ACC TCT AGA GGG CAA GAA GGC CAG CAA TCG CCA GAA CAA TGG CAG AAG ATG TAC GGT AGA 1449

CSGNEVYHIVLEESTFFAEY 503

TGC TCC GGG AAT GAA GTC TAC CAC ATT GTT TTG GAA GAA AGT ACA TTT TTT GCT GAA TAT 1509

EGKSFTYASFHAHKKFGVCL 523

GAA GGA AAG AGT TTT ACA TAT GCC TCT TTC CAT GCA CAC AAA AAG TTT GGC GTC TGC TTG 1569

IGVRREDNKNILLNPGPRYI 543

ATT GGT GTT AGG AGG GAG GAT AAT AAA AAC ATT TTG CTG AAT CCA GGT CCT CGA TAC ATT 1629

MNSTDICFYINITKEENSAF 563

ATG AAT TCT ACG GAC ATA TGC TTT TAT ATT AAT ATT ACC AAA GAA GAG AAT TCA GCA TTT 1689

KNQDQQRKSNVSRSFYHGPS 583

AAA AAC CAA GAC CAG CAG AGA AAA AGC AAT GTG TCC AGG TCG TTT TAT CAT GGA CCT TCC 1749

RLPVHSIIASMGTVAIDLQD 603

AGA TTA CCT GTA CAT AGC ATA ATT GCC AGC ATG GGT ACT GTG GCT ATA GAC CTG CAA GAT 1809

TSCRSASGPTLSLPTEGSKE 623

ACA AGC TGT AGA TCA GCA AGT GGC CCT ACC CTG TCT CTT CCT ACA GAG GGA AGC AAA GAA 1869

IRRPSIAPVLEVADTSSIQT 643

ATA AGA AGA CCT AGC ATT GCT CCT GTT TTA GAG GTT GCA GAT ACA TCA TCG ATT CAA ACA 1929

CDLLSDQSEDETTPDEEMSS 663

TGT GAT CTT CTA AGT GAC CAA TCA GAA GAT GAA ACT ACA CCA GAT GAA GAA ATG TCT TCA 1989

NLEYAKGYPPYSPYIGSSPT 683

AAC TTA GAG TAT GCT AAA GGT TAC CCA CCT TAT TCT CCA TAT ATA GGA AGT TCA CCC ACT 2049

FCHLLHEKVPFCCLRLDKSC 703

TTT TGT CAT CTC CTT CAT GAA AAA GTA CCA TTT TGC TGC TTA AGA TTA GAC AAG AGT TGC 2109
QHNYYEDAKAYGFKNKLIIV 723

CAA CAT AAC TAC TAT GAG GAT GCA AAA GCC TAT GGA TTC AAA AAT AAA CTA ATT ATA GTT 2169

AAETAGNGLYNFIVPLRAYY 743

GCA GCT GAA ACA GCT GGA AAT GGA TTA TAT AAC TTT ATT GTT CCT CTC AGG GCA TAT TAT 2229

RPKKELNPIVLLLDNPLDDL 763

AGA CCA AAG AAA GAA CTT AAT CCC ATA GTA CTG CTA TTG GAT AAC CCC CTA GAT GAC TTA 2289

LRCGVTFAANMVVVDKESTM 783

CTC AGG TGT GGA GTG ACT TTT GCT GCT AAT ATG GTG GTT GTG GAT AAA GAG AGC ACC ATG 2349

SAEEDYMADAKTIVNVQTLF 803

AGT GCC GAG GAA GAC TAC ATG GCA GAT GCC AAA ACC ATT GTG AAC GTG CAG ACA CTC TTC 2409

RLFSSLSIITELTHPANMRF 823

AGG TTG TTT TCC AGT CTC AGT ATT ATC ACA GAG CTA ACT CAC CCC GCC AAC ATG AGA TTC 2469

MQFRAKDCYSLALSKLEKKE 843

ATG CAA TTC AGA GCC AAA GAC TGT TAC TCT CTT GCT CTT TCA AAA CTG GAA AAG AAA GAA 2529

RERGSNLAFMFRLPFAAGRV 863

ijGG GAG AGA GGC TCT AAC TTG GCC TTT ATG TTT CGA CTG CCT TTT GCT GCT GGG AGG GTG 2589

FSISMLDTLLYQSFVKDYMI 883

TTT AGC ATC AGT ATG TTG GAC ACT CTG CTG TAT CAG TCA TTT GTG AAG GAT TAT ATG ATT 2649

SITRLLLGLDTTPGSGFLCS 903

TCT ATC ACG AGA CTT CTG TTG GGA CTG GAC ACT ACA CCA GGA TCT GGG TTT CTT TGT TCT 2709

MKITADDLWIRTYARLYQKL 923

ATG AAA ATC ACT GCA GAT GAC TTA TGG ATC AGA ACT TAT GCC AGA CTT TAT CAG AAG TTG 2769

CSSTGDVPIGIYRTESQKLT 943

TGT TCT TCT ACT GGA GAT GTT CCC ATT GGA ATC TAC AGG ACT GAG TCT CAG AAA CTT ACT 2829

TSESRKIASQSQISISVEEW 963

ACA TCT GAG TCT CGA AAA ATA GCA TCA CAA TCT CAA ATA TCT ATC AGT GTA GAA GAG TGG 2889

EDTKDSKEQGHHRSNHRNST 983

GAA GAC ACC AAA GAC TCC AAA GAA CAA GGG CAC CAC CGC AGC AAC CAC CGC AAC TCA ACA 2949

SSDQSDHPLLRRKSMQWARR 1003

TCC AGT GAC CAG TCG GAC CAT CCC TTG CTG CGG AGA AAA AGC ATG CAG TGG GCC CGA AGA 3009

LSRKGPKHSGKTAEKITQQR 1023

CTG AGC AGA AAA GGC CCA AAA CAC TCT GGT AAA ACA GCT GAA AAA ATA ACC CAG CAG CGA 3069

LNLYRRSERQELAELVKNRM 1043

CTG AAC CTC TAC AGG AGG TCA GAA AGA CAA GAG CTT GCT GAA CTT GTG AAA AAT AGA ATG 3129

KHLGLSTVGYDEMNDHQSTL 1063

AAA CAC TTG GGT CTT TCT ACA GTG GGA TAT GAT GAA ATG AAT GAT CAT CAA AGT ACC CTC 3189

SYILINPSPDTRIELNDVVY 1083

TCC TAC ATC CTG ATT AAC CCA TCT CCA GAT ACC AGA ATA GAG CTG AAT GAT GTT GTA TAC 3249
LIRPDPLAYLPNSEPSRRNS 1103
TTA ATT CGA CCA GAT CCA CTG GCC TAC CTT CCA AAC AGT GAG CCC AGT CGA AGA AAC AGC 3309

ICNVTGQDSREETQL* 1119
ATC TGC AAT GTC ACT GGT CAA GAT TCT CGG GAG GAA ACT CAA CTT TGA 3357
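
Each protein line in these listings is the standard-genetic-code translation of the codon line beneath it. The following is a minimal sketch of that relationship, assuming the standard code (NCBI translation table 1); the function name `translate` is illustrative and not part of the original listing. It can be used to cross-check a residue line against its codon line when one of the two is poorly legible.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated
# with bases in the order T, C, A, G at each of the three positions.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(codon_line: str) -> str:
    """Translate a whitespace-separated codon line into one-letter residues.

    Codons that are not three unambiguous bases (for example, tokens damaged
    during scanning) are reported as 'X' instead of being guessed.
    """
    residues = []
    for codon in codon_line.upper().split():
        residues.append(CODON_TABLE.get(codon, "X"))
    return "".join(residues)


if __name__ == "__main__":
    # Final codon line of the 54414 open reading frame, ending in the TGA stop.
    last_line = "ATC TGC AAT GTC ACT GGT CAA GAT TCT CGG GAG GAA ACT CAA CTT TGA"
    print(translate(last_line))
```

Returning `X` for unreadable codons keeps the check conservative: a damaged token is flagged for inspection and never silently replaced.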

TAAAAATAAAATGAGAAACTTTTTTCCTACAAAGACCTTGCTTGAAACCAC 

GATGGAAATATATGTAATTCTCTCATATTTAAAAACGTA^ 

GTACTACTTACTGGTACTCTCCCTATTAATATTTGAAGGACCTCA 

AAAATTTAAATCTGACATTTAATTGTTTTATAATAATCCA 

TGAAGTTGACAAAATCTAACTATATTTGGTGCATCACAATGGAC^^ 
3 GTCATATTATATTCTTTAAACTTACTGTTTTACAAAATTGAGCTCATC 
CACCAACAAACTTGTGTGGCTGACTTTTC^ 
;y|TTTTTTTCTGCCTTACGATATAAAAATAT^ 
3gTAAAACATAAATGAAAAGAAAC^^ 
I ^AAGCATACTATAAAGCAAATATCTATTATTCTC 
-TGACTTAAATTTAATTCAAGGAT 

^TTTATACCTTTTATGGACTCTGAAGACACT^ 

SIgangatgtattaaattttgactt 
Protein Family / Domain Matches, HMMer version 2

Searching for complete domains in PFAM

hmmpfam - search a single seq against HMM database

HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine

HMMER is freely distributed under the GNU General Public License (GPL).



HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam

Sequence file: /prod/ddm/wspace/orfanal/oa-script.3743.seq



Query: Fbh54414

Scores for sequence family classification (score includes all domains):
Model     Description                Score    E-value  N
ion_trans Ion transport protein       62.4    9.9e-15  1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
ion_trans   1/1     104   277        1   223 []    62.4  9.9e-15



Alignments of top-scoring domains: 

ion_trans: domain 1 of 1, from 104 to 277: score 62.4, E = 9.9e-15 

*->ilf ildllfvllf lleivlkf iayglkstsniaakylksif nildll 
++++ ++ + ++ + i + l + +y + + + + + +++i +il ++ 

Fbh54414 104 LWGLQVSVALISLFETILLGYLSYKGN IWEQILRIPFILEI I 145 

ailplllllvlflsgteqvakkrlrerf slelsqwyyrilrf lrlLrllR 
++p++++++ + + + 1 + ++L ++ 

146 NAVPFIISIFWPSLRN LFVPVFLNCW- 171 



Fbh54414 



Fbh54414 



Fbh54414 



Fbh54414 



lLrllrllrrletlf ef elgtlaWslqslgralksilrf 11111111 igf 
1 + + +++ + +1 + ++r+ + ++++ +1+1+ 1++ 
172 LAKHALENM INDL HRAI QRTQSAMFNQVL I L I STLLCL 209 

svigyllfkgyedlsenevdgnsefssyfdafyflfvtlttvGfGdlvpv 
++ + +++e + ++ ++fd++yf ++vt++tvGfGd++p+ 

210 IFTCICGIQHLER IGKRLNLFDSLYFCIVTFSTVGFGDVTPE 251 

. wlgiif fvlf f iivgllllnlliavi<-* 
+w+ + + + f+. + + i +v +l ++l + + + 
252 tWSSKLFV-VAMICVALWLPIQFEQL 277 
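
The per-domain heading lines in this output (for example `ion_trans: domain 1 of 1, from 104 to 277: score 62.4, E = 9.9e-15`) follow a regular pattern. The following is a small sketch of how such a line could be read into a record, assuming exactly the wording seen in this listing; `DomainHit` and `parse_domain_line` are hypothetical names, not part of HMMER.

```python
import re
from typing import NamedTuple, Optional


class DomainHit(NamedTuple):
    model: str      # HMM name, e.g. "ion_trans"
    index: int      # which domain this is (the "1" in "1 of 1")
    count: int      # total domains found for this model
    seq_from: int   # first matched residue of the query
    seq_to: int     # last matched residue of the query
    score: float    # bit score
    evalue: float   # E-value


# Pattern written against the heading lines as they appear in this listing.
_DOMAIN_RE = re.compile(
    r"^(?P<model>\S+):\s+domain\s+(?P<index>\d+)\s+of\s+(?P<count>\d+),\s+"
    r"from\s+(?P<start>\d+)\s+to\s+(?P<end>\d+):\s+"
    r"score\s+(?P<score>-?\d+(?:\.\d+)?),\s+E\s*=\s*(?P<evalue>\S+)$"
)


def parse_domain_line(line: str) -> Optional[DomainHit]:
    """Return a DomainHit for a domain heading line, or None if it does not match."""
    match = _DOMAIN_RE.match(line.strip())
    if match is None:
        return None
    return DomainHit(
        model=match["model"],
        index=int(match["index"]),
        count=int(match["count"]),
        seq_from=int(match["start"]),
        seq_to=int(match["end"]),
        score=float(match["score"]),
        evalue=float(match["evalue"]),
    )


if __name__ == "__main__":
    hit = parse_domain_line(
        "ion_trans: domain 1 of 1, from 104 to 277: score 62.4, E = 9.9e-15"
    )
    print(hit)
```

Lines that do not fit the pattern return `None`, so scanned text with damaged headings is skipped without raising an error.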
CLUSTAL W (1.74) multiple sequence alignment



54414.prot
AF089730



54414.prot
AF089730



54414.prot
AF089730



54414.prot
AF089730



54414.prot
AF089730



54414.prot
AF089730



MVDLESEVPPLPPR YRFRDL — LLGDEGWQN 

MARAKL PRS P S EGKAG PGDTPAGSAAPEEPHGLS PLL PTRGGGSVGSDVGQRLHVEDFSL 

* * * * *: * *.:. 

T7M 



DD RVQVEFYMNENTFKERIjKLFF IKNQRSSLR IF LFNFSLKLLSCLLYI IRVLI ENP 

DS SLSOV0VEFYVNE^^IT7CERIJKliFF I KNQRSSLR II ^FT?FSIJG^LTCXJLiYIVRVIjj DNP 
* .******.*********************************:*****:****:** 



SQGN _ _ _ _ ewshifwvnrsl^wglqvsvalislfetili^ylsy'| 

iYig y 



Df^TfyTdf^ TKYNYTFNQSSS EFHWAP ILWVERKM ^jWVTQVrVATI SFTjETMLLI^i 

*. *.**.* :.** :** ** **..**.** **** 

^ENMINDLHRA 
IrENMIMDFHRA 
t* **'***.*******************:*** 
Pore 



KGNIWEQ CLRIPFILEIINAVPFIIS 
KGKIWEQ CFHVSI-\ T LEMINTLPFII'I 



3EFWPSLF NLFVPVFI^CWIJCai^ 
WFWPPU NLFIPV 7 FIJNO^?l4^ <HALE 



*****★**... *.*★.** 



IQF^SAMFN^ILISTIJX^ 1.FDSLYFCXVTFSTVGFGD j=>=>V^-J cJ<*~ eA 

IIJ ^SAMFNg^ILFCTI^LVFTGTC^ f^^^ ^ ^v^» \^ 

* *************. *****.** ******** * ***. *.************** 3 



* *************.*****.** ******** * .***: * : ************** 
W^lfiPSQLLVVTL^ 

***. * *.*.** .***.*★***.***.★ ******************;********** 



5*414 .prot SLKIDLI2C>FI2tfEFYAHPRIjQDYY 

AEQ897 3 0 SIJ<XDIJJMDFI2JEFYAHP^ 

L ^ F ****************************** : ******* ******:*************** 

5|ll4 . prot QDLLRAKMDDAEACFILSSRCEV^ 

AF£)897 3 0 QDU"IRAKMIXK3^CFILSSRNEVD^^ 

;Ljl ***.*****.********* *****.-******************************** 

54414 . prot IKFAEHVVCEEEFKYAMLALNC IC PATSTL ITIXVHTSRGQBG<^SPEQWK^^CSGN 

A*S)897 3 0 VKFADHVVCEEECKYAMLALNC IC PATSTL ITI^VHTSRGQBGQESPEQWQRMYGRCSGN 

: f™ .*********** *******************************:******:******** 

54414 . prot EVYHIVLEESTFFAEYBGKSFTYASFHAHKKFGMC^ 

AF089730 EVYHIRMGDSKFFREYBG^FTYAAFHAHKKYGVC^ 

***** . .* ** **********.******.****** : : ** : **.********:* : :: 



54414 ..prot DICFYINITKEENSA — FKNQDQQRKSNVS- - SFYHGPSRLPVHS I IASMGTVAIDLQDT 

AF089730 i?iCFxlOTTKEENSAFIFKQEEK<^^ 

* ************* **....*. : 



,.* ************** **.**★.* 



54414 . prot SCRSASGPT I^LPTEGSKEIRRPSIAFVLWADTSSIQ^ 

AFO 89730 IXZ31PSQGGSGGGGGKIjTLPTEM3SGSRRPS IAPVLELAISSAIJJPC1)IJ-^DQSEDEVTPS 

*.★*** **********.**.*.- ************* 



* * . * - 



544 14 . prot DEE^ISSNXJETYAKGYPPYSPYIGSSPT^ 
AF0897 30 D DEX^3WEYVKGYPFNSPYIGSSPT^^ 

*.* * .** ***** *********.**** ********** m * : ** ********** 

54414 .prot NKL I IVAAETAGNGLYNF IVPIJ^YYRPKKELNPIVI^IJ^J- P 

AF08973 0 NKLI IVSAETAGNGLYNF IVPLRAYYRSRRELNP IVLXiL^^ 

******.******************** ..*********** * 



544 14. prot 
AF089730 



LDDLLRCX3VT F AANMVVVDKEST^ISAKEDYMADAKT IVNVQTLFRLFS StS I IT 

EGSVDNLDSLLQCGI I YAD^VVVDKESTTISAI^^ ITT 



************* 



544 14 . prot ELTHPANMRFMQFTU>dCD^ 

AF0897 30 ELTHPSNMRFMQFRAKDSYSLALSK^ ISMLDTLL 

*****.*********************:***.*************************** 

54414 .prot YQSFVKDYMISITRLLI^IXrTTPG 
AF089730 YQSFVKDYMITITRI^IJSIJ^ 

**********.************* *★.**:**:* ****★*♦*,★*:******:.;:*** 



544 14 . prot I YRTESQKLTTS ESRK IASQSQI S ISVEEWEDTKDSK EQGHHRSNHRNS TSSD 

AF0897 3 0 I YRTBCH - VFS S EPHDLRAQSQ I SVNMEDC EITn^EAKG PWGTRAASGGGSTHGRHGGSAD 

**♦**. : .** . . .*****..*. *** ::; * 



* . * . * 



544 14 . pro t QSDHPLI^RRKS^^QWARRLSRKGPKHSGK - - -TAEKITQQIU^YRRSERQEIJ^LVKNRM 

AF089730 PVEHPLLRI^SLQWARKLSRKSSKQAGKAPMT^^ 

.********.****.****_*;;** * :: ******.**********;******* 



54414. prot 
AF089730 



KHLGLSTVGY DEMND- HQSTLS YIL XNPS PDTR I ELNDV 

KHI^LPTTCYEDVA^TASD\m^ 

***** * ** ***** **_****.****.****;* ** : 



544 14 . prot VYL IRPDPLAYLPNSEPSRRNS ICNVTG QDSREETQL 

AF089730 VYL IRSDPLAHVTSSSQSRKSSCSNKLSSCNPETRDETQL 

***** ****.. * **- * * ..*.**** 
Input file Fbh53763pat.seq; Output File Fbh53763pat.tra
Sequence length 2847

CCACGCGTCCGGCCCTGTGCTTCGGATGGCGGCGGGAGGTTGATGGCGAGTGGTGCTGAAGGGACAGCTCCAGCAGTGG 
CTGATTTGGGGGAGAAACAAAATCTGCAGATGGAATCCGAGCAGGGCGACTTCACCTTCAAGTGG 

CTGCGGCCAGTCTCCACTCCATTCACGGCCAGCCGATCTGCCCGCTCCCGGAGGGGTCGGGCAGTGCCGGCTGGACCCG 

CCCCGAGCTCCATGGTTTGCCCAACCCTGCGCGATGGTGACTCTGGGCGCGGAGGTTGGCGACTGGCAAATCCGCAGAT 

CACAGAATGAAGGCGGGGAGCGCGGCCGGCGGCCGGCGGGGGCTTTCTCCCCCACCCCAGCGCCCAGGGAAGCGGCTCA 

ACCACCTGAATCCGGAAAACGCCAACAAGTAGTTTCTCGTCGGAGAAGGGCGGCTCACCTGGGCGCCAAGACTCAGTCC 

CGCTGCCCAGAGAACCTCGTCCACTCGGAAACCAAAGCAGAACCACTTTTCTCTCGG 

MGKIENNERVILNVGGTR 18 

feiACAGAG ATG GGC AAG ATC GAG AAC AAC GAG AGG GTG ATC CTC AAT GTC GGG GGC ACC CGG 54 

HETYRSTLKTLPGTRLALLA 38

CAC GAA ACC TAC CGC AGC ACC CTC AAG ACC CTG CCT GGA ACA CGC CTG GCC CTT CTT GCC 114

SSEPPGDCLTTAGDKLQPSP 58

Jfec TCC GAG CCC CCA GGC GAC TGC TTG ACC ACG GCG GGC GAC AAG CTG CAG CCG TCG CCG 174

PPLSPPPRAPPLSPGPGGCF 78

CCT CCA CTG TCG CCG CCG CCG AGA GCG CCC CCG CTG TCC CCC GGG CCA GGC GGC TGC TTC 234

EGGAGNCSSRGGRASDHPGG 98

t3te GGC GGC GCG GGC AAC TGC AGT TCC CGC GGC GGC AGG GCC AGC GAC CAT CCC GGT GGC 294

GREFFFDRHPGVFAYVLNYY 118

CGC GAG TTC TTC TTC GAC CGG CAC CCG GGC GTC TTC GCC TAT GTG CTC AAT TAC TAC 354

RTGKLHCPADVCGPLFEEEL 138

CGC ACC GGC AAG CTG CAC TGC CCC GCA GAC GTG TGC GGG CCG CTC TTC GAG GAG GAG CTG 414

AFWGIDETDVEPCCWMTYRQ 158

GCC TTC TGG GGC ATC GAC GAG ACC GAC GTG GAG CCC TGC TGC TGG ATG ACC TAC CGG CAG 474

HRDAEEALDIFETPDLIGGD 178

CAC CGC GAC GCC GAG GAG GCG CTG GAC ATC TTC GAG ACC CCC GAC CTC ATT GGC GGC GAC 534

PGDDEDLAAKRLGIEDAAGL 198

CCC GGC GAC GAC GAG GAC CTG GCG GCC AAG AGG CTG GGC ATC GAG GAC GCG GCG GGG CTC 594 

GGPDGKSGRWRRLQPRMWAL 218 

GGG GGC CCC GAC GGC AAA TCT GGC CGC TGG AGG AGG CTG CAG CCC CGC ATG TGG GCC CTC 654 

FEDPYSSRAARFIAFASLFF 238 

TTC GAA GAC CCC TAC TCG TCC AGA GCC GCC AGG TTT ATT GCT TTT GCT TCT TTA TTC TTC 714 

ILVSITTFCLETHEAFNIVK 258 

ATC CTG GTT TCA ATT ACA ACT TTT TGC CTG GAA ACA CAT GAA GCT TTC AAT ATT GTT AAA 774 

NKTEPVINGTSVVLQYEIET 278 

AAC AAG ACA GAA CCA GTC ATC AAT GGC ACA AGT GTT GTT CTA CAG TAT GAA ATT GAA ACG 834 
DPALTYVEGVCVVWFTFEFL 298

GAT CCT GCC TTG ACG TAT GTA GAA GGA GTG TGT GTG GTG TGG TTT ACT TTT GAA TTT TTA 894

VRIVFSPNKLEFIKNLLNII 318

GTC CGT ATT GTT TTT TCA CCC AAT AAA CTT GAA TTC ATC AAA AAT CTC TTG AAT ATC ATT 954

DFVAILPFYLEVGLSGLSSK 338

GAC TTT GTG GCC ATC CTA CCT TTC TAC TTA GAG GTG GGA CTC AGT GGG CTG TCA TCC AAA 1014

AAKDVLGFLRVVRFVRILRI 358

GCT GCT AAA GAT GTG CTT GGC TTC CTC AGG GTG GTA AGG TTT GTG AGG ATC CTG AGA ATT 1074

FKLTRHFVGLRVLGHTLRAS 378

TTC AAG CTC ACC CGC CAT TTT GTA GGT CTG AGG GTG CTT GGA CAT ACT CTT CGA GCT AGT 1134

TNEFLLLIIFLALGVLIFAT 398

ACT AAT GAA TTT TTG CTG CTG ATA ATT TTC CTG GCT CTA GGA GTT TTG ATA TTT GCT ACC 1194

MIYYAERVGAQPNDPSASEH 418

ATG ATC TAC TAT GCC GAG AGA GTG GGA GCT CAA CCT AAC GAC CCT TCA GCT AGT GAG CAC 1254

TQFKNIPIGFWWAVVTMTTL 438

ACA CAG TTC AAA AAC ATT CCC ATT GGG TTC TGG TGG GCT GTA GTG ACC ATG ACT ACC CTG 1314

GYGDMYPQTWSGMLVGALCA 458

GGT TAT GGG GAT ATG TAC CCC CAA ACA TGG TCA GGC ATG CTG GTG GGA GCC CTG TGT GCT 1374

LAGVLTIAMPVPVIVNNFGM 478

CTG GCT GGA GTG CTG ACA ATA GCC ATG CCA GTG CCT GTC ATT GTC AAT AAT TTT GGA ATG 1434

YYSLAMAKQKLPRKRKKHIP 498

TAC TAC TCC TTG GCA ATG GCA AAG CAG AAA CTT CCA AGG AAA AGA AAG AAG CAC ATC CCT 1494

PAPQASSPTFCKTELNMACN 518

CCT GCT CCT CAG GCA AGC TCA CCT ACT TTT TGC AAG ACA GAA TTA AAT ATG GCC TGC AAT 1554

STQSDTCLGKDNRLLEHNRS 538

AGT ACA CAG AGT GAC ACA TGT CTG GGC AAA GAC AAT CGA CTT CTG GAA CAT AAC AGA TCA 1614

VLSGDDSTGSEPPLSPPERL 558

GTG TTA TCA GGT GAC GAC AGT ACA GGA AGT GAG CCG CCA CTA TCA CCC CCA GAA AGG CTC 1674

PIRRSSTRDKNRRGETCFLL 578

CCC ATC AGA CGC TCT AGT ACC AGA GAC AAA AAC AGA AGA GGG GAA ACA TGT TTC CTA CTG 1734

TTGDYTCASDGGIRKGYEKS 598

ACG ACA GGT GAT TAC ACG TGT GCT TCT GAT GGA GGG ATC AGG AAA GGA TAT GAA AAA TCC 1794

RSLNNIAGLAGNALRLSPVT 618

CGA AGC TTA AAC AAC ATA GCG GGC TTG GCA GGC AAT GCT CTG AGG CTC TCT CCA GTA ACA 1854

SPYNSPCPLRRSRSPIPSIL 638

TCA CCC TAC AAC TCT CCT TGT CCT CTG AGG CGC TCT CGA TCT CCC ATC CCA TCT ATC TTG 1914

* 639

TAA 1917
ACCAAACAACCAAACTGCATCAGTCGGCTAAATTGTATTAATTCAAGYGCTGTTTACCCCATAATGGAAATAATTAAAT

GTAGAGTTACTCCAGGCTCCATTAATACAGTATAAATCTTGCGTGATACTACAATTTGAAGTCAGAAATGCCACTTGGG 

TAGCTAATGAATCTTACCCAGGCTTTAAAGATTGTCTAAAGTAGTGCTAAGATCCCTCCTATTAATTGCCCTGATATCC 
TTTTGCAATAAAATGACAGATAGTGTCAGATAT 

AAAACAGTGTGCTTCCAAATGCCAACCACTTCATTGGAACTTTATTTCTTGTGA 
Protein Family / Domain Matches, HMMer version 2



Searching for complete domains in PFAM

hmmpfam - search a single seq against HMM database

HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine

HMMER is freely distributed under the GNU General Public License (GPL).



HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.4688.seq



Query: Fbh53763

Scores for sequence family classification (score includes all domains):
Model       Description                                  Score    E-value  N
K_tetra     K+ channel tetramerisation domain            156.7      4e-43  1
ion_trans   Ion transport protein                        116.9    3.9e-31  1
oxidored_q3 NADH-ubiquinone/plastoquinone oxidoreduct    -81.7        5.6  1

Parsed for domains:
Model       Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
K_tetra       1/1       8   156        1   111 []   156.7    4e-43
oxidored_q3   1/1     317   467        1   177 []   -81.7      5.6
ion_trans     1/1     281   472        1   223 []   116.9  3.9e-31



Alignments of top-scoring domains: 

K_tetra: domain 1 of 1, from 8 to 156: score 156.7, E = 4e-43 

* - >ErvrLNVGGkrFeTsksTLtrf kpdTlLgrllktdsd 

Erv+LNVGG+r+eT++sTL ++ p T+L 1++ S+++++ ++ + 
8 ERVILNVGGTRHETYRSTLKTL- PGTRLALLAS- -SEppgdclttag 51 



Fbh53763 



Fbh53763 



Fbh53763 



Fbh53763 



vhearlrlcd 

++ ++++++ + + + + + ++ ++++++ +++ ++ +++++++ d 
52 dklqpsppplsppprapplspgpggcfeggagncssrggrA 



SD 94 



fyddetgEyFFDRsPkhFetlLnfYRtGdGkLhrp.evcldsf leEleFy 
+ E+FFDR+P++F ++Ln+YRt GkLh+p +vc f+eEl+F+ 
95 HPGGGR-EFFFDRHPGVFAYVLNYYRT- -GKLHCPaDVCGPLFEEELAFW 141 



gldelaiesCcedeY<- 
g+de ++e+Cc+++Y 
14 2 GIDETDVEPCCWMTY 



156 



oxidored_q3: domain 1 of 1, from 317 to 467: score -81.7, E = 5.6 

*->mtyivliLsillvlGflgVaskpsPiYgaLgLivaggvGCGlvlslG 

+ +v iL + 1 +G+ g++sk + +++ +v ++++++ 1 
317 I IDFVAI LPFYLEVGLSGLSSKAAKDVLGFLRWRFVRI - LRIFKLT 362 



Fbh53763 



Fbh53763 



Fbh53763 



Fbh53763 



gsFvalvlFLIYLGGMlWFgYtvalateeyPEaWgsnkwwtigdgval 
Fv+ 1 ++ g t t e+ + ++i ++1 
363 RHFVGLRVL GHTLRASTNEF L LLI IFL 3 89 

vigil ievlLvglvl gwtewiwaltglGdwviYdvegsg 

lg+li++ ++++ ++ + ++++++ e + +++ G w + v+ + 
390 ALGVLIFATMIYYAErvgaqpndpSASEHTQFKNIP- IGFW- -WAWTM- 435 

1 iredl sGvaaLYscgvwmf evaGwvLLval f wieltR< - * 
++ G +Y +w + G L al +V+++++ 
436 TTLGYGDMYPQ - TWSGMLVG - - ALCALAGVLT I AM 467 
ion_trans: domain 1 of 1, from 281 to 472: score 116.9, E = 3.9e-31

Fbh53763

*->ilf ildllfvllf lleivlkf iayglkstsniaakylksifnildll 
1 + + + + ++v ++f ++e++++++ +++k ++k+ ni+d+ 

281 ALTYVEGVCWWFTFEFLVR I VFS PNK LEFIKNLLNIIDFV 321 



Fbh53 763 



Fbh53 763 



ailplllllvlf lsgteqvakkrlrer. f . slelsq . wyyrilrf IrlLr 
ailp++l ++1 ++++s ++ + +flr++r 
3 22 AILPFYLEVGL SgLsSKAAKDvL GFLRWR 351 

HRlLrllrllrrletlfefelgtlaWslqslgralksilrf 11111111 
++R +lr++ +++ +++ 1+ lg++l++ ++ +111+++1 
352 FVR ILRIFKLTR HFVG LRVLGHTLRASTNEFLLL 1 1 FL 389 

igfsvigyllfkgyedlse. . . . nevdgnsef ssyf daf yf If vtlttvG 
+ + i + + + + + e+ ++ + + ++ + + f +++ +f ++++vt+tt+G 
Fbh53763 390 ALGVL I FATMI YYAERVGAqpndPSASEHTQFKNI P I GFWWAWTMTTLG 439 

f Gdl vpv . wig i i f f vl f f i i vgl 1 1 lnl 1 iavi < - * 
+Gd++p +w+g++++ +++++-i-g+l + + + + + + +vi 
Fbh53763 440 YGDMYPQtWSGMLVG-ALCALAGVLTIAMPVPVI 472 



// 

Searching for complete domains in SMART 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 
HMMER is freely distributed under the GNU General Public License (GPL) 



HMM file: /ddm/robison/smart/smart/smart.all.hmms
Sequence file: /prod/ddm/wspace/orfanal/oa-script.4688.seq



Query: Fbh53763

Scores for sequence family classification (score includes all domains):
Model    Description    Score    E-value  N
BTB_4                    72.7    7.8e-18  1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
BTB_4      1/1       8   159        1   114 []    72.7  7.8e-18



Alignments of top-scoring domains: 

BTB_4: domain 1 of 1, from 8 to 159: score 72.7, E = 7.8e-18 

*->cDvtlwggdlggdnaegkkfhasqHkavLaacrrdSpyFkalf es . 
+ +V+ 1 +V gg ++ + + +++L + + + + +al+ S+ 

Fbh53763 8 ERVILNVGG TRHET - - YRSTLKTL PGTRLALLAS s 40 



+ + + + + ++ +++ ++++++ +++++ + + ++++++ + + + ++ +++++ 
Fbh53763 41 eppgdcl ttagdklqpsppplsppprapplspgpggcf eggagncssrgg 90 

ieldDeallievspeaFralLnf lYt . kldlpeedvenve 

+ ++++++++++++D ++p +F +Ln+++t+kl++p + + + 

Fbh53763 91 rasdhpgggrEFFFD RHPGVFAYVLNYYRTgKLHCPAD- - VCGP 132 

elLelAdf IdSYGqip. lvelCeef llknl<- * 
+ e+ f+ G+ + + ve C+++ +++ 
Fbh53763 133 LFEEELAFW GIDEtDVEPCCWMTYRQH 159 



// 



FIGURE 18B 



CLUSTAL W (1.74) multiple sequence alignment



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



Fbh53763pat
ratCIKE



MGKIENNERVILNVGGTRHETYRSTLKTLPGTRLALLASSEPPGDCLTTAGDKLQPSPPP
MGKIENNERVILNVGGTRHETYRSTLKTLPGTRLALLASSEPQGDCLTAAGDKLQPLPPP
****************************************** *****:******* ***

LSPPPRAPPLSPGPGGCFEGGAGNCSSRGGRASDHPGGGREFFFDRHPGVFAYVLNYYRT
LSPPPRPPPLSPVPSGCFEGGAGNCSSHGGNGSDHPGGGREFFFDRHPGVFAYVLNYYRT
****** ***** * ************:**..****************************

GKLHCPADVCGPLFEEELAFWGIDETDVEPCCWMTYRQHRDAEEALDIFETPDLIGGDPG
GKLHCPADVCGPLFEEELAFWGIDETDVEPCCWMTYRQHRDAEEALDIFETPDLIGGDPG
************************************************************

DDEDLAAKRLGIEDAAGLGGPDGKSGRWRRLQPRMWALFEDPYSSRAARFIAFASLFFIL
DDEDLGGKRLGIEDAAGLGGPDGKSGRWRKI.QPRMWALFEDPYSSRAARfTMMIiEOL 

***** **********************;****************************** 



VSITTFCLETHEAFNIVKNKTEPVINGTSVVLQYEIETDPALTYVEGVCVVWFTFEFLVR
VS ITTFCL STHEAFNI VKNKTEPVINGTSAVLQYE I ETDPALTYVEyVCVVWF^F^FLVR 
*******^*********^***^****** . ****************************** 
TrvS , ___I£2^ 

IVFSPNKLEFIKNLLNIIDFVAILPFYLEVGLSGLSSKAAKDVLGFLRVVRFVRILRIFK



t trr^PMTCT ,EF I ^ ,T .NT IDFVAILPFYLEVGLB GLSSKAAKDVL GFLRWRFVRI LRIFK 

k * * * * i 



1T*~*~* ****** ******* ********** ********************** **^*********^ 



LTRHFVGLRVLGHTLRASTNEFLLLIIFLALGVLIFATMIYYAERVGAQPNDPSASEHTQ
LTRHFVGL p\tt r,HTT J? ASTKT EjFLLLI I FLALGVLI FATMI Ym ERVGAQPNDPSASEHTQ 
******** *************************** ************************* 



pFgfwwavvt 

FKNIP IGFWWAWTMTT^ 

**** V* ****** ********** ****** ******************************** 



SLAMAKQKLPRKRK3<HIPPAPQASSPTFCKTELNMACNSTQSDTCLG 
SLAMAKQKLPRKRKKHIPPAPLASSPTFCKT^ 

********************* **************************:*********** 

SGDDSTGSEPPLSPPERLPIRRSSTRDKNRRGETCFLLTTGDYTCASDGGIRKGYEKSRS 
SGDDSTGSEPPLSPPERLPIRRSSTRDKNRRGETCFLLTTGDYTCASDGGIRKGYEKSRS 

************************************************************ 

LNNIAGLAGNALRLSPVTSPYNSPCPLRRSRSPIPSIL 
LNNIAGLAGNALRLSPVTSPYNSPCPLRRSRSPIPSIL 
************************************** 
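
The `*` symbols under each alignment block mark columns where both sequences carry the same residue. As a rough illustration of how such a block can be summarised, the sketch below counts identical columns for the first Fbh53763pat / ratCIKE block shown above. `percent_identity` is an illustrative helper and not CLUSTAL W output, and the figure applies to that one 60-column block only.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned columns holding the same residue in both rows.

    Both rows must come from the same alignment block and therefore have the
    same length. Columns that are gaps ('-') in both rows are ignored, and a
    gap never counts as a match.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    columns = [(a, b) for a, b in zip(row_a, row_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# First alignment block of the listing, split into groups of ten residues.
FBH53763_BLOCK1 = (
    "MGKIENNERV" "ILNVGGTRHE" "TYRSTLKTLP"
    "GTRLALLASS" "EPPGDCLTTA" "GDKLQPSPPP"
)
RATCIKE_BLOCK1 = (
    "MGKIENNERV" "ILNVGGTRHE" "TYRSTLKTLP"
    "GTRLALLASS" "EPQGDCLTAA" "GDKLQPLPPP"
)

if __name__ == "__main__":
    print(percent_identity(FBH53763_BLOCK1, RATCIKE_BLOCK1))
```

Ignoring columns that are gaps in both rows matters only when rows are taken from a larger multi-sequence alignment; for a two-sequence block it has no effect.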
Input file Fbh67076FL.seq; Output File Fbh67076FL.tra
Sequence length 6582



CCACGCGTCCGCCCACGCGTCCGCCCACGCGTCCGAGAAGGCTTAGGTGGGCAGGCAGGACGAGAGAAAGACTGAGAGG 

AGGGAAAGCCGCGTAGGTGGGAGTACAGCGGCGCGAGGGTCGAGGGGGAACCCTCGTCGGTGCAGATGAGGAGGGTGGG 

CTTTCAGAACTAGTCCCCCCTCGCACCCCGCCCCGCCCCTCCCGCGCTGGGGTCTTCACGGTGCCCTGCCTCAGAGCCC 

GGCTCCACCACGCCCGGAAGAGGGAGTCTGGCCGTCGGCTGGCTCAGGGCGGGCCGGTTGGCTGTACCCAGGCTCCCTG 

GCCCGAGTGCGGGACCAGAGCGCGGGGCGGCGCGGCAGCCGCGGGCCGAGGAGGGGCTGCGAGCGAAACGGCGCGGCGC 

GGCACGGCGGACGAGTTAGGGCCGGGGCGAGGGAGGCTGTGGCTCCCGACAGAGACAGGGGAGTAGTGTCGGGCTGAGG 

MFRRSLN 7

CGAGACAGCCCGGTAGAGCCCAGCTCAGCGCCCGGCAGCCTTCGACGCG ATG TTC CGC CGG AGC TTG AAT 21

RFCAGEEKRVGTRTVFVGNH 27

CGT TTT TGT GCT GGA GAA GAG AAA CGA GTT GGC ACA CGC ACA GTG TTT GTT GGC AAT CAT 81

PVSETEAYIAQRFCDNRIVS 47

CCA GTT TCG GAA ACA GAA GCT TAC ATT GCA CAA AGA TTT TGT GAT AAT AGA ATA GTC TCA 141

SKYTLWNFLPKNLFEQFRRI 67

TCT AAG TAT ACA CTT TGG AAT TTT CTC CCA AAG AAT CTG TTT GAA CAG TTT AGA AGA ATT 201

ANFYFLIIFLVQVTVDTPTS 87

GCA AAT TTT TAT TTT CTC ATA ATC TTC CTT GTA CAG GTC ACA GTA GAC ACA CCA ACT AGC 261

PVTSGLPLFFVITVTAIKQG 107

CCA GTT ACC AGT GGA CTT CCA CTT TTC TTT GTT ATA ACT GTT ACA GCC ATC AAG CAG GGA 321

YEDWLRHRADNEVNKSTVYI 127

TAT GAG GAT TGG CTG AGA CAC AGA GCT GAC AAT GAA GTC AAC AAA AGC ACT GTT TAC ATT 381

IENAKRVRKESEKIKVGDVV 147

ATT GAA AAT GCA AAG CGA GTG AGA AAA GAA AGT GAA AAA ATC AAG GTT GGT GAT GTA GTA 441

EVQADETFPCDLILLSSCTT 167

GAA GTA CAG GCA GAT GAA ACC TTT CCC TGT GAT CTT ATT CTT CTA TCA TCT TGC ACC ACT 501

DGTCYVTTASLDGESNCKTH 187

GAT GGA ACC TGT TAT GTC ACT ACA GCC AGT CTT GAT GGG GAA TCC AAT TGC AAG ACA CAT 561

YAVRDTIALCTAESIDTLRA 207

TAT GCA GTA CGT GAT ACC ATT GCA CTG TGT ACA GCA GAA TCC ATC GAT ACC CTC CGA GCA 621

AIECEQPQPDLYKFVGRINI 227

GCA ATT GAA TGT GAA CAG CCT CAA CCT GAC CTC TAC AAA TTT GTT GGG CGA ATC AAT ATC 681

YSNSLEAVARSLGPENLLLK 247

TAC AGT AAT AGT CTT GAG GCT GTT GCC AGG TCT TTG GGA CCT GAA AAT CTC TTG CTG AAA 741

GATLKNTEKIYGVAVYTGME 267

GGA GCT ACG CTA AAA AAT ACC GAG AAG ATA TAT GGA GTT GCT GTT TAC ACT GGA ATG GAA 801 
TKMALNYQGKSQKRSAVEKS 287

ACC AAA ATG GCT TTG AAC TAC CAA GGG AAA TCT CAG AAA CGT TCT GCT GTT GAA AAA TCT 861

INAFLIVYLFILLTKAAVCT 307

ATT AAT GCT TTC CTG ATT GTA TAT TTA TTT ATC TTA CTG ACC AAA GCT GCA GTA TGC ACT 921

TLKYVWQSTPYNDEPWYNQK 327

ACT CTA AAG TAT GTT TGG CAA AGT ACC CCA TAC AAT GAT GAA CCT TGG TAT AAC CAA AAG 981

TQKERETLKVLKMFTDFLSF 347

ACT CAG AAA GAG CGA GAG ACC TTG AAG GTT TTA AAA ATG TTC ACC GAC TTC CTA TCA TTT 1041

MVLFNFIIPVSMYVTVEMQK 367

ATG GTT CTA TTC AAC TTT ATC ATT CCT GTC TCC ATG TAC GTC ACA GTA GAA ATG CAG AAA 1101

FLGSFFISWDKDFYDEEINE 387

TTC TTG GGC TCC TTC TTC ATC TCA TGG GAT AAG GAC TTT TAT GAT GAA GAA ATT AAT GAA 1161

GALVNTSDLNEELGQVDYVF 407

GGA GCC CTG GTT AAC ACA TCA GAC CTT AAT GAA GAA CTT GGT CAG GTG GAT TAT GTA TTT 1221

TDKTGTLTENSMEFIECCID 427

ACA GAT AAG ACT GGA ACA CTC ACT GAA AAC AGC ATG GAA TTC ATT GAA TGC TGC ATA GAT 1281

GHKYKGVTQEVDGLSQTDGT 447

GGC CAC AAA TAT AAA GGT GTA ACT CAA GAG GTT GAT GGA TTA TCT CAA ACT GAT GGA ACT 1341

LTYFDKVDKNREELFLRALC 467

TTA ACA TAT TTT GAC AAA GTA GAT AAG AAT CGA GAA GAG CTG TTT CTA CGT GCC TTG TGT 1401

LCHTVEIKTNDAVDGATESA 487

TTA TGT CAT ACT GTA GAA ATC AAA ACA AAC GAT GCT GTT GAT GGA GCT ACA GAA TCA GCT 1461

ELTYISSSPDEIALVKGAKR 507

GAA TTA ACC TAT ATC TCC TCT TCA CCA GAT GAA ATA GCT TTG GTG AAA GGA GCT AAA AGG 1521

YGFTFLGNRNGYMRVENQRK 527

TAC GGG TTC ACA TTT TTA GGA AAT CGA AAT GGA TAT ATG AGA GTA GAG AAC CAA AGA AAA 1581

EIEEYELLKTLNFDAVRRRM 547

GAA ATA GAA GAA TAT GAA CTT CTT CAC ACC TTA AAC TTT GAT GCT GTC CGG CGA CGT ATG 1641

SVIVKTQEGDILLFCKGADS 567

AGT GTA ATT GTG AAG ACT CAA GAA GGA GAC ATA CTT CTC TTT TGT AAA GGA GCA GAC TCG 1701

AVFPRVQNHEIELTKVHVER 587

GCA GTT TTT CCC AGA GTG CAA AAT CAT GAA ATT GAG TTA ACT AAA GTC CAT GTG GAA CGT 1761

NAMDGYRTLCVAFKEIAPDD 607

AAT GCA ATG GAT GGG TAT CGG ACA CTC TGT GTA GCC TTC AAA GAA ATT GCT CCA GAT GAT 1821

YERINRQLIEAKMALQDREE 627

TAT GAA AGA ATT AAC AGA CAG CTC ATA GAG GCA AAA ATG GCC TTA CAA GAC AGA GAA GAA 1881

KMEKVFDDIETNMNLIGATA 647

AAA ATG GAA AAA GTT TTC GAT GAT ATT GAG ACA AAC ATG AAT TTA ATT GGA GCC ACT GCA 1941
VEDKLQDQAAETIEALHAAG 667

GTT GAA GAC AAG CTA CAA GAT CAA GCT GCA GAG ACC ATT GAA GCT CTG CAT GCA GCA GGC 2001

LKVWVLTGDKMETAKSTCYA 687

CTG AAA GTC TGG GTG CTC ACT GGG GAC AAG ATG GAG ACA GCT AAA TCC ACA TGC TAT GCC 2061

CRLFQTNTELLELTTKTIEE 707

TGC CGC CTT TTC CAG ACC AAC ACT GAG CTC TTA GAA CTA ACC ACA AAA ACC ATT GAA GAA 2121

SERKEDRLHELLIEYRKKLL 727

AGT GAA AGG AAA GAA GAT CGA TTA CAT GAA TTA TTG ATA GAA TAT CGC AAG AAA TTG CTG 2181

HEFPKSTRSFKKAWTEHQEY 747

CAT GAG TTT CCT AAA AGT ACT AGA AGC TTT AAA AAA GCA TGG ACA GAA CAT CAG GAA TAT 2241

GLIIDGSTLSLILNSSQDSS 767

GGA TTA ATC ATA GAT GGC TCC ACA TTG TCA CTC ATA CTA AAT TCT AGT CAA GAC TCT AGT 2301

SNNYKSIFLQICMKCTAVLC 787

TCA AAC AAT TAC AAA AGC ATT TTC CTA CAA ATA TGT ATG AAG TGT ACT GCA GTG CTC TGC 2361

CRMAPLQKAQIVRMVKNLKG 807

TGT CGG ATG GCA CCA TTA CAG AAA GCC CAG ATT GTC AGA ATG GTG AAG AAT TTA AAA GGC 2421

SPITLSIGDGANDVSMILES 827

AGC CCA ATA ACT CTG TCG ATA GGT GAT GGT GCC AAT GAT GTT AGT ATG ATC TTG GAA TCC 2481

HVGIGIKGKEGRQAARNSDY 847

CAT GTG GGA ATA GGT ATT AAA GGC AAA GAA GGT CGC CAA GCA GCT AGG AAT AGC GAT TAT 2541

SVPKFKHLKKLLLAHGHLYY 867

TCT GTT CCA AAG TTT AAA CAC TTA AAG AAA CTG CTG TTG GCT CAT GGA CAT CTA TAT TAT 2601

VRIAHLVQYFFYKNLCFILP 887

GTG AGA ATA GCA CAC CTT GTA CAG TAC TTC TTC TAT AAG AAC CTT TGT TTC ATT TTG CCA 2661

QFLYQFFCGFSQQPLYDAAY 907

CAG TTT TTG TAC CAG TTC TTC TGT GGA TTC TCA CAA CAG CCA CTG TAT GAT GCT GCT TAC 2721

LTMYNICFTSLPILAYSLLE 927

CTT ACA ATG TAC AAT ATC TGC TTC ACA TCC TTG CCC ATC CTG GCC TAT AGT CTA CTG GAA 2781

QHINIDTLTSDPRLYMKISG 947

CAG CAC ATC AAC ATT GAC ACT CTG ACC TCA GAT CCC CGA TTG TAT ATG AAA ATT TCT GGC 2841

NAMLQLGPFLYWTFLAAFEG 967

AAT GCC ATG CTA CAG TTG GGC CCC TTC TTA TAT TGG ACA TTT CTG GCT GCC TTT GAA GGG 2901

TVFFFGTYFLFQTASLEENG 987

ACA GTG TTC TTC TTT GGG ACT TAC TTT CTT TTT CAG ACT GCA TCC CTA GAA GAA AAT GGA 2961

KVYGNWTFGTIVFTVLVFTV 1007

AAG GTA TAC GGA AAC TGG ACT TTT GGA ACC ATT GTT TTT ACA GTC TTA GTA TTC ACT GTA 3021

TLKLALDTRFWTWINHFVIW 1027

ACC CTG AAG CTT GCC TTG GAT ACC CGA TTC TGG ACG TGG ATA AAT CAC TTT GTG ATT TGG 3081
GSLAFYVFFSFFWGGIIWPF 1047
GGT TCT TTA GCC TTC TAT GTA TTT TTC TCA TTC TTC TGG GGA GGA ATT ATT TGG CCT TTT 3141



LKQQRMYFVFAQMLSSVSTW 1067
CTC AAG CAA CAG AGA ATG TAT TTT GTA TTT GCC CAA ATG CTG TCT TCT GTA TCC ACA TGG 3201

LAIILLIFISLFPEILLIVL 1087
TTG GCT ATA ATT CTT CTA ATA TTT ATC AGC CTG TTC CCT GAG ATT CTT CTG ATA GTA TTA 3261

KNVRRRSARRNLSCRRASDS 1107
AAG AAT GTA AGA AGA AGA AGT GCC AGG AGA AAT CTG AGC TGT AGA AGG GCA TCT GAC TCA 3321

LSARPSVRPLLLRTFSDESN 1127
TTA TCC GCC AGA CCT TCA GTC AGA CCT CTT CTT TTA CGA ACA TTC TCA GAC GAA TCT AAT 3381

VL* 1130
GTA TTG TAA 3390
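
In these listings the number ending each protein line counts residues from the initiator methionine (with the stop codon counted as one position), and the number ending each codon line counts coding bases from the A of ATG, so every pair of trailing numbers should satisfy base number = 3 × residue number. The sketch below checks a few pairs taken from the listings above; `numbering_consistent` is an illustrative helper, not part of the original program output.

```python
from typing import Iterable, List, Tuple


def numbering_consistent(residue_no: int, base_no: int) -> bool:
    """True when the coding-base count equals three times the residue count."""
    return base_no == 3 * residue_no


def find_inconsistent(pairs: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the (residue_no, base_no) pairs that break the 3:1 relationship."""
    return [pair for pair in pairs if not numbering_consistent(*pair)]


# (residue number, base number) pairs read from line ends in the listings.
OBSERVED_PAIRS = [
    (3, 9),        # 54414: M V D
    (23, 69),      # 54414
    (1119, 3357),  # 54414: final line, stop codon included
    (639, 1917),   # Fbh53763: stop codon line
    (1130, 3390),  # Fbh67076FL: final line, stop codon included
]

if __name__ == "__main__":
    print(find_inconsistent(OBSERVED_PAIRS))
```

A pair that fails this check usually points to a digit split or dropped during scanning, which makes the relationship a cheap way to validate line-end numbers.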

CAGAATCCGAATCTTGAACTGCCTATGTTATTGTCCTACAAGCATAC 

AAGAAACAACTACAAAAAGTTATCATCTCAGGATACTTGATATGCAA 

AATAAATGTTCATTAAAATACCAAATGATTCTCTTAAGCATTTAC 

AAGTTAAGAATTATATGAAAGTTGAAAGCAAGAATACTTAGAATTC 

TGCTCTTTTAACCCATGAACTTTGTGAATGGATTTAAATACAATAGTA 

GATTTTGC1 




AATAGTTAATCCCTTCTGTTTACCCATGTGCTACTAATGTCTTGG 

TTATGTGGAAAGTGTTAACTTACGGGTATTTTTGTGGGAATAGAAAAAAATTC 

CCCCACTTATGGGTGTAAGCCTACTAGACTTGAAAATAAAGTATAAA^ 

AGTTAGAAAATAAACAGATTTTTCCAGTGTTGATTTTACTGGGATC 

AAATAAAGGTCATTCTGAATATCAGCCTTTTATAATTTTATC 

CGGTTTTTATTTGAAAGAGATTGCATTTATGCAW 

AGGAGCCACCCCAAAACGGTGGTTCAGCTTGTAGAGCCATGACTCTGTGAAGATGAATGTTGTCTC 
GGAAATGGTCTAACTCTAAACCATGTAACTGACCTTAGTAAAGTCCTTGACTAACTGAACTA 
TCTAATTAGTTCACTTGAAACATAAATGTGAAATGTCTTCATTCAATGTTA 
CATATTTATTTGACTGCTAGTTTTTTTGTTTTTT^ 

TACCTTGATTTGGAAAAGTATTGGAGTTAATCTGTATTATATTTATATAGTCCAT 

ATATTTTGTGTTAATGTTTAGGTATGATTTTTTTCTA 

CATCATTATAGACCCTTTTTCATTATTTCATTTGCTCTCATATATC 



TCACAACTTACCTAAGTGTGCTGTGTTCTCX3TAG 

TTGCCTATTCCAAAGAGCTAAAAAAGTCTAACCCAGGAAAGCTTTTGATATT^ 

TGTTGTTGCTGTATTATGATTGCTGTTTTAC^ 

AGGCTTGTTTAATGCAGTACCATTGGAGAGTTAACAGAATAATC 

TCCAGCCAGAAAGAAAGAAAGACAAGGAGTAAGGGGGATTTAGAGTTATGTCTCAGCTACACATTACATTGTG 

CAGCTCAAATTCAGAATGGCAATGATACATGATATCATGGCCTAGATCCTTGAGAGGGACCTGGCTTTCCTTTTTAAAA 

GATATTTTACTGAAGAGCTAAAAACTGGCCAGTGTGGGGTTAGCAGATCGAATAACTTGAAATAGACCGTC 

CTAGCACTCAATGTAATCACCCTATTTGTGACA 

TAATTTTGAGCTATCAAAATGTCTTTGTAATTTTCACAA 

AACATTCATTCCATATCTACTTACACATACACCAGCAA 

TTAGTGATGGAATTTTTTAATAACATGCAGTATATAAATGTGCAGATTTTATC 

TGCAAAATGGGACTGCAATATTACATTTTTC^^ 

GAATGCCATCTTTTATGACTGCAACTTGCCTTTTCCATT^^ 

GATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA^ 
Protein Family / Domain Matches, HMMer version 2 
Searching for complete domains in PFAM 
hmmpfam - search a single seq against HMM database 
HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL). 

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam 
Sequence file: /prod/ddm/wspace/orfanal/oa-script.13758.seq 

Query: 67076 

Scores for sequence family classification (score includes all domains): 
Model     Description                                     Score  E-value  N 
Hydrolase haloacid dehalogenase-like hydrolase             12.7    0.019  1 

Parsed for domains: 
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value 
Hydrolase   1/1     403   837 ..     1   184 []    12.7    0.019 

Alignments of top-scoring domains: 

Hydrolase: domain 1 of 1, from 403 to 837: score 12.7, E = 0.019 

*->ikavvFDkDGTLtdgkeppiaeaivealrelgl 

++ v+ Dk+GTLt + + e +++ + +g++++ ++ ++++++ 
67076 403 VDYVFTDKTGTLTEN - SMEF I ECC I DGHKYKGVt qe vdgl sqt dg 1 1 448 

. . . . apleevekllgrgl . gerilleggltaell 

+ + + + el +r+l ++++ t +++++ +++ + + ++++ 

67076 449 tyf dKVDKNREELFLRALcL- -CHTVEIKTNDAVdgatesael tyisssp 496 



+ + + + ++ + + +++++ + ++ +++ ++ + + + + + +++ 
67076 4 97 deialvkgakrygf t f Ignrngymrvenqrkeieeyellhtlnf davrrr 546 



+ + + + + + ++ ++ ++ +++ + ++ + +++ ++ ++ 

67076 547 msvivktqegdil If ckgadsavf prvqnheieltkvhvernamdgyrtl 596 

Id . evlgli 

++ +++ ++ ++ + + +++++ ++ ++ +++ ++g+ 
67076 597 c v a f k e i ap ddy e r i nr ql i e akma 1 qd r e e kme k v f d d i e TNmNL I GAT 646 

al . dklypgarealkaLkerGikvailTngdr . naealle algla 

a++dkl + a+e+++aL+++G+kv++lT++ ++a+ + +++ 

67076 647 AVeDKLQDQAAET I EALHAAGLKVW VLTGDKMe TAKS TCYa c r 1 f QTNTE 696 

. If daivdsdevggvgpvwgKPkpeif llalerlgvkpeevg 

]_ +++e + k + +++ + 1 + +++k ++ +++ + + 

67076 697 1LELTTKTIEESE RKEDRLHELLIEYRKKLL- -Hef pkstr 73 5 



+ ++ +++++ + ++++ + ++++++++++ ++ + + + 
67076 736 sf kkawtehqeygl iidgstlslilnssqdsssnnyksif Iqicmkctav 785 

p . kvlmvGDginDapalaaAGvgvamgn 

+ + ++ + + ++ ++++++1 +GDg nD+ ++ +vg+ + 
67076 786 1 c c rmap 1 qkaqi vrmvknl kgSp I TLS IGDGANDVSMI LESHVG I GI KG 835 

gg<-* 

67076 836 KE 837 
CLUSTAL W (1.74) multiple sequence alignment 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



MFRRSLNRFCAGEEKRVGTRTVFVGN-HPVSETEAYIAQRFCDNRIVSSKYTLWNF 

MDCSLLRTLVRRYCAGEENWVDSRTIYVGHKEPPPGAEAYIPQRYPDNRIVSSKYTFWNF 

. *.***★*. ★ .*★..**, * .*★** ★* . **********.*★* 
• • • * • • 

liPKNLFEQFRRIANFYFLlr I FLVQVTVDjTPTSPVTSGLPLFFVITVTAIK 3GYEDWLRHR 
I jPKNLFEQFRRIANFYFLt l I FLVQL 1 1 DfrPTSPVTSGLPLFFVITVTAIKP GYEDWLRHK 



. .******************************** . 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



ADNEVNKSTVY 1 1 ENAKRVRKES EKI KVGD WE VQADET F P CDL I LLS SCTTDGTCYV' 
ADNAMNQCPVHFIQHGKLVRKQSRKLRVGDIVMVKEDETFPCDLIFLSSNRADGTCHV' 
* * * ***«**..***.* * ; *********-*** 



3* 



ASEbGESNCKTHYAVRDTIALCTAESIDTLRAAIECEQPQPDLYKFVGRINIYSNSLEAV 
ASLDGES SHKTHYAVQDTKGFHTEADVDSLHAT I ECEQPQPDLYKFVGRI NVYNDLNDPV 



ARSLGPENLLLKGATLKNTEKIYGVAVYTGMETKMALNYQGKSQKRSAVEKSINAI^IVY 
VRPLGSENLLLRGATLKNTEKIFGVAIYTGMETKMAI^ 

★ ** *****.**********.***-**★********** ***********-*.***** 
• ••• • * * 

-jry 



LFI LLTKAAVCTTLKYVW 2STP YNDEPWYNQKTQKERETLKVLKMFTDF LSFMVLFNFII 
LC I LVS KALINTVLKYVVff Ss EPFRDE PWYNEKTESERQRNLFLRAFTDF|LAFMVXiFNYI I 

★ **..** . ******** *.********* : % ** : ***** : ****** ; ** 



PVSMYVTVEl^KFLGSFFISWDKDFYDEEINEGALWTSDIJ^ELGQW 
PVSMYVTVEMD KFLGSYFITWDEDMFDEEMGEGPLVNTSDLNEELGQVEY _ 
****************.**.**-*-.***- ** **************.*-********* 




KTGTLT) 
t)KTGTLTi 



Fbh67076FL 
MouseATlH 



ENSME F I ECC I DGHKYKG VTQE VDGLS QTDGTLT Y FDKVD KNREELFLRALCLCHT 

ENNMAFKECCIEGHVYVPHVICNGQVLPDSSGIDMIDSSPGVCGREREELFFRAICLCHT 

************ • ; * . 



* * * * * . * * . ***** 



Fbh67076FL 
MouseATlH 



VE I KTN DAVDGATES AELTY I S S S PDE I ALVKGAKRYGFTFLGNRNG YMR VENQ 

VQVKDDHCGDD VDG PQKS PDAKS CVY ISSS PDE VALVEGVQRLGFTYLRLKDNYME I LNR 
* . . * . * * * * . * 



****★★**.**★.* .* ***.* 



Fbh67076FL 
MouseATlH 



RKEIEEYELLHTLtNFDAVRRRMSVIVKTQEGDILLFCKGADSAVFPRVQNHEIELTKVHV 

END I ERFELLEVLTFDS VRRRMS VI VKSTTGE I YLFCKGADS S I F PR VI EGKVDQ VRS R V 
..** .*** * **.**********- * - * ********..**** • ... . .* 



Fbh67076FL 
MouseATlH 



ERNAMDGYRTLCVAFKE I APDD YER I NRQL I EAKMALQDREEKMEKVFDD I ETNMNL I G A 
ERNAVEGLRTLCVAYKRLEPEQYEDACRLLQSAKVALQDREKXLAEAYEQIEKDLVLLGA 
****..* ****** . * . ★.-** * * **.******.★. * : ** 



Fbh67076FL 
MouseATlH 



TAVEDKljQDQAAETIEALHAAGLKVWVLTGDKMETAKSTCYACRLFQTNTELLELTTKTI 

TA VEDRLQE KAADT I EALQKAG I KVWVLTGDKMETASATCYACKLFRRSTQLLE LTTKKL 
*****.**..*★.**★**. **.************* .**★**-**. *.******* 



Fbh67076FL 
MouseATlH 



EESERKEDRLHELLIEYRKJO^LHEFPKSTR-SFKKAWTEHQEYGLIIDGSTLSLII^SSQ 
EEQS LHDVLFDLSKTVLRCSGSMTRDSFSGLSTDMHDYGLIIDGAALSLIMKPRE 



* * - . * 



*******. . * * * * 



Fbh67076FL 
MouseATlH 



Fbh67076FL 



D - SSSNNYKS I FLQI CMKCTAVLCCRMAPLQKAQI VRMVKNLKGS P I TLSlpDG ANDVSImI 
DGSSSGNYRELFLEICRNCSAVLCCRMAPLQKAQIVKLIK^ 

* *** **. .**.** .*.****************...* * **★*. l*J* ****** **2; 

• •••• • • ■ — • **i 


{TpESHVGIGI KGKEGRQAARNS DY S VP KF KHLKKLLLAHGHLYYVR I AH^VQYFFYKNLC 



MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



Fbh67076FL 
MouseATlH 



I LEAHVG I GVIGKEGRQAARNSDYAI PKFKHLKKMLLVHGHFYYI R I SE jLVQYFFYKNVC 
*** : ***** ; *************. .********. ***** •**•**. .*★****★**.* 



FILPQFLYQFFCGFS^PLYI^YLTMYNICFTSLPILAYSLIJBQHINIDTLTSDPRLYM 
FIFPQFLYQFl FCGFSQQTLYD mYLTLYNISFTSLPILLYSlK EQHVGIDVLKRDPTLYR 
**.************** m *** - **** : ***_ ******* ***.***. m ** ^ * a ** ** 



KISGNAMLQLG^FLYWTFLAAFEGTVFFFGTYF^^ 

^ENTTVTINGQMF3m£S33 



D I AKNALLRWR\f F I YWTF LGVFDA IiVFF FGAYF IfFENTTVTI NGQMF QflWTFGTT ,VFTVM 

**...***★***.****. 



*. **.*. ******* *. *****.**.* 

VFTAm^CLA^TRFWT^ IWPFLKQQRMYFVFAqRlSS 
VLTVTLKLALDfr HYWT wtlNHFVIWGSLLFYIAFSLLW WPFLS YQRMY Y VF I S^LSS 
*.**********..************* **. ** . : ***.***** m ****.** **★* 

TfM* 

VSTWLAIILLIFISLFPEILLIVLlK NVRRRSARRNLSCRRASDSLSAR 

GPAWLGIILLVTVGLLPDVIJtK KVLCRQLWPTATERTQNIQHQDSISEFTPLASLPSWGAQ 
.******..*.*.** ★* ... . * *. 



Fbh67076FL 
MouseATlH 



PSVRPLLLRTFSDESNVL 

GSRLLAAQCSSPSGRWCSRWESEECPVLPLHPGLPHKARYGCCRSSLEMPT 
***. * *.* ** 



FIGURE 23B 



Input file Fbh67102FL.seq; Output File Fbh67102FL.tra 
Sequence length 6074 

CCACGCGTCCGGGAGGAGCGGAGGGAGAAGTAGGTTGCGAGCTC 
1TGGGAGATGTCTAAGTGATTTTTTTTTTTTC 

CGAATTTGTGCTTAGCTCTTTTCTTGTACCTTGCGACTCGTGACCAACATGCTGTGATGTGTGCCGAGGGA 

MTEALQWARY 10 

TCAGCTACACAACCTGGATCTTACCACAGTTTGGAT ATG ACT GAG GCT CTC CAA TGG GCC AGA TAT 30 

HWRRLIRGATRDDDSGPYNY 30 

CAC TGG CGA CGG CTG ATC AGA GGT GCA ACC AGG GAT GAT GAT TCA GGG CCA TAC AAC TAT 90 

SSLLACGRKSSQIPKLSGRH 50 

TCC TCG TTG CTC GCC TGT GGG CGC AAG TCC TCT CAG ATC CCT AAA CTG TCA GGA AGG CAC 150 

RIVVPHIQPFKDEYEKFSGA 70 

CGG ATT GTT GTT CCC CAC ATC CAG CCC TTC AAG GAT GAG TAT GAG AAG TTC TCC GGA GCC 210 

YVNNRIRTTKYTLLNFVPRN 90 

TAT GTG AAC AAT CGA ATA CGA ACA ACA AAG TAC ACA CTT CTG AAT TTT GTG CCA AGA AAT 270 

LFEQFHRAASLYFLFLVVLN 110 

TTA TTT GAA CAA TTT CAC AGA GCT GCC AGT TTA TAT TTC CTG TTC CTA GTT GTC CTG AAC 330 

WVPLVEAFQKEITMLPLVVV 130 

TGG GTA CCT TTG GTA GAA GCC TTC CAA AAG GAA ATC ACC ATG TTG CCT CTG GTG GTG GTC 390 

LTIIAIKDGLEDYRKYKIDK 150 

CTT ACA ATT ATC GCA ATT AAA GAT GGC CTG GAA GAT TAT CGG AAA TAC AAA ATT GAC AAA 450 

QINNLITKVYSRKEKKYIDR 170 

CAG ATC AAT AAT TTA ATA ACT AAA GTT TAT AGT AGG AAA GAG AAA AAA TAC ATT GAC CGA 510 

CWKDVTVGDFIRLSCNEVIP 190 

TGC TGG AAA GAC GTT ACT GTT GGG GAC TTT ATT CGC CTC TCC TGC AAT GAG GTC ATC CCT 570 

ADMVLLFSTDPDGICHIETS 210 

GCA GAC ATG GTA CTA CTC TTT TCC ACT GAT CCA GAT GGA ATC TGT CAC ATT GAG ACT TCT 630 

GLDGESNLKQRQVVRGYAEQ 230 

GGT CTT GAT GGA GAG AGC AAT TTA AAA CAG AGG CAG GTG GTT CGG GGA TAT GCA GAA CAG 690 

DSEVDPEKFSSRIECESPNN 250 

GAC TCT GAA GTT GAT CCT GAG AAG TTT TCC AGT AGG ATA GAA TGT GAA AGC CCA AAC AAT 750 

DLSRFRGFLEHSNKERVGLS 270 

GAC CTC AGC AGA TTC CGA GGC TTC CTA GAA CAT TCC AAC AAA GAA CGC GTG GGT CTC AGT 810 

KENLLLRGCTIRNTEAVVGI 290 

AAA GAA AAT TTG TTG CTT AGA GGA TGC ACC ATT AGA AAC ACA GAG GCT GTT GTG GGC ATT 870 

VVYAGHETKAMLNNSGPRYK 310 

GTG GTT TAT GCA GGC CAT GAA ACC AAA GCA ATG CTG AAC AAC AGT GGG CCA CGG TAT AAG 930 
RSKLERRANTDVLWCVMLLV 330 

CGC AGC AAA TTA GAA AGA AGA GCA AAC ACA GAT GTC CTC TGG TGT GTC ATG CTT CTG GTC 990 

IMCLTGAVGHGIWLSRYEKM 350 

ATA ATG TGC TTA ACT GGC GCA GTA GGT CAT GGA ATC TGG CTG AGC AGG TAT GAA AAG ATG 1050 

HFFNVPEPDGHIISPLLAGF 370 

CAT TTT TTC AAT GTT CCC GAG CCT GAT GGA CAT ATC ATA TCA CCA CTG TTG GCA GGA TTT 1110 

YMFWTMIILLQVLIPISLYV 390 

TAT ATG TTT TGG ACC ATG ATC ATT TTG TTA CAG GTC TTG ATT CCT ATT TCT CTC TAT GTT 1170 

SIEIVKLGQIYFIQSDVDFY 410 

TCC ATC GAA ATT GTG AAG CTT GGA CAA ATA TAT TTC ATT CAA AGT GAT GTG GAT TTC TAC 1230 

DSIVQCRALNIAEDLG 430 

GAT TCT ATT GTT CAG TGC CGA GCC CTG AAC ATC GCC GAG GAT CTG GGA 1290 

LFSDKTGTLTENKMVF 450 

CTC TTT TCC GAT AAG ACA GGA ACC CTC ACT GAG AAT AAG ATG GTT TTT 1350 

RRCSVAGFDYCHEENARRLE 470 

CGA AGA TGT AGT GTG GCA GGA TTT GAT TAC TGC CAT GAA GAA AAT GCC AGG AGG TTG GAG 1410 

SYQEAVSEDEDFIDTVSGSL 490 

TCC TAT CAG GAA GCT GTC TCT GAA GAT GAA GAT TTT ATA GAC ACA GTC AGT GGT TCC CTC 1470 

SNMAKPRAPSCRTVHNGPLG 510 

AGC AAT ATG GCA AAA CCG AGA GCC CCC AGC TGC AGG ACA GTT CAT AAT GGG CCT TTG GGA 1530 

NKPSNHLAGSSFTLGSGEGA 530 

AAT AAG CCC TCA AAT CAT CTT GCT GGG AGC TCT TTT ACT CTA GGA AGT GGA GAA GGA GCC 1590 

SEVPHSRQAAFSSPIETDVV 550 

AGT GAA GTG CCT CAT TCC AGA CAG GCT GCT TTC AGT AGC CCC ATT GAA ACA GAC GTG GTA 1650 

PDTRLLDKFSQITPRLFMPL 570 

CCA GAC ACC AGG CTT TTA GAC AAA TTT AGT CAG ATT ACA CCT CGG CTC TTT ATG CCA CTA 1710 

DETIQNPPMETLYIIDFFIA 590 

GAT GAG ACC ATC CAA AAT CCA CCA ATG GAA ACT TTG TAC ATT ATC GAC TTT TTC ATT GCA 1770 

LAICNTVVVSAPNQPRQKIR 610 

TTG GCA ATT TGC AAC ACA GTA GTG GTT TCT GCT CCT AAC CAA CCC CGA CAA AAG ATC AGA 1830 

HPSLGGLPIKSLEEIKSLFQ 630 

CAC CCT TCA CTG GGG GGG TTG CCC ATT AAG TCT TTG GAA GAG ATT AAA AGT CTT TTC CAG 1890 

RWSVRRSSSPSLNSGKEPSS 650 

AGA TGG TCT GTC CGA AGA TCA AGT TCT CCA TCG CTT AAC AGT GGG AAA GAG CCA TCT TCT 1950 

GVPNAFVSRLPLFSRMKPAS 670 

GGA GTT CCA AAC GCC TTT GTG AGC AGA CTC CCT CTC TTT AGT CGA ATG AAA CCA GCT TCA 2010 

PVEEEVSQVCESPQCSSSSA 690 

CCT GTG GAG GAA GAG GTC TCC CAG GTG TGT GAG AGC CCC CAG TGC TCC AGT AGT TCA GCT 2070 
N E K M 

AAT GAA AAA ATG 

Q I Q Y 

CAG ATT CAG TAC 



CCTETEKQHGDAGLLNGKAE 710 

TGC TGC ACA GAA ACA GAG AAA CAA CAC GGT GAT GCA GGC CTC CTG AAT GGC AAG GCA GAG 2130 

SLPGQPLACNLCYEAESPDE 730 

TCC CTC CCT GGA CAG CCA TTG GCC TGC AAC CTG TGT TAT GAG GCC GAG AGC CCA GAC GAA 2190 

AALVYAARAYQCTLRSRTPE 750 

GCG GCC TTA GTG TAT GCC GCC AGG GCT TAC CAA TGC ACT TTA CGG TCT CGG ACA CCA GAG 2250 

QVMVDFAALGPLTFQLLHIL 770 

CAG GTC ATG GTG GAC TTT GCT GCT TTG GGA CCA TTA ACA TTT CAA CTC CTA CAC ATC CTG 2310 

PFDSVRKRMSVVVRHPLSNQ 790 

CCC TTT GAC TCA GTA AGA AAA AGA ATG TCT GTT GTG GTC CGA CAC CCT CTT TCC AAT CAA 2370 

VVVYTKGADSVIMELLSVAS 810 

GTT GTG GTG TAT ACG AAA GGC GCT GAT TCT GTG ATC ATG GAG TTA CTG TCG GTG GCT TCC 2430 

PDGASLEKQQMIVREKTQKH 830 

CCA GAT GGA GCA AGT CTG GAG AAA CAA CAG ATG ATA GTA AGG GAG AAA ACC CAG AAG CAC 2490 

LDDYAKQGLRTLCIAKKVMS 850 

TTG GAT GAC TAT GCC AAA CAA GGC CTT CGT ACT TTA TGT ATA GCA AAG AAG GTC ATG AGT 2550 

DTEYAEWLRNHFLAETSIDN 870 

GAC ACT GAA TAT GCA GAG TGG CTG AGG AAT CAT TTT TTA GCT GAA ACC AGC ATT GAC AAC 2610 

REELLLESAMRLENKLTLLG 890 

AGG GAA GAA TTA CTA CTT GAA TCT GCC ATG AGG TTG GAG AAC AAA CTT ACA TTA CTT GGT 2670 

ATGIEDRLQEGVPESIEALH 910 

GCT ACT GGC ATT GAA GAC CGT CTG CAG GAG GGA GTC CCT GAA TCT ATA GAA GCT CTT CAC 2730 

KAGIKIWMLTGDKQETAVNI 930 

AAA GCG GGC ATC AAG ATC TGG ATG CTG ACA GGG GAC AAG CAG GAG ACA GCT GTC AAC ATA 2790 

AYACKLLEPDDKLFILNTQS 950 

GCT TAT GCA TGC AAA CTA CTG GAG CCA GAT GAC AAG CTT TTT ATC CTC AAT ACC CAA AGT 2850 

KDACGMLMSTILKELQKKTQ 970 

AAA GAT GCC TGT GGG ATG CTG ATG AGC ACA ATT TTG AAA GAA CTT CAG AAG AAA ACT CAA 2910 

ALPEQVSLSEDLLQPPVPRD 990 

GCC CTG CCA GAG CAA GTG TCA TTA AGT GAA GAT TTA CTT CAG CCT CCT GTC CCC CGG GAC 2970 

SGLRAGLIITGKTLEFALQE 1010 

TCA GGG TTA CGA GCT GGA CTC ATT ATC ACT GGG AAG ACC CTG GAG TTT GCC CTG CAA GAA 3030 

SLQKQFLELTSWCQAVVCCR 1030 

AGT CTG CAA AAG CAG TTC CTG GAA CTG ACA TCT TGG TGT CAA GCT GTG GTC TGC TGC CGA 3090 

ATPLQKSEVVKLVRSHLQVM 1050 

GCC ACA CCG CTG CAG AAA AGT GAA GTG GTG AAA TTG GTC CGC AGC CAT CTC CAG GTG ATG 3150 

TLAIGDGANDVSMIQVADIG 1070 

ACC CTT GCT ATT GGT GAT GGT GCC AAT GAT GTT AGC ATG ATA CAA GTG GCA GAC ATT GGG 3210 
IGVSGQEGMQAVMASDFAVS 1090 

ATA GGG GTC TCA GGT CAA GAA GGC ATG CAG GCT GTG ATG GCC AGT GAC TTT GCC GTT TCT 3270 

QFKHLSKLLLVHGHWCYTRL 1110 

CAG TTC AAA CAT CTC AGC AAG CTC CTT CTT GTC CAT GGA CAC TGG TGT TAT ACA CGG CTT 3330 

SNMILYFFYKNVAYVNLLFW 1130 

TCC AAC ATG ATT CTC TAT TTT TTC TAT AAG AAT GTG GCC TAT GTG AAC CTC CTT TTC TGG 3390 

YQFFCGFSGTSMTDYWVLIF 1150 

TAC CAG TTC TTT TGT GGA TTT TCA GGA ACA TCC ATG ACT GAT TAC TGG GTT TTG ATC TTC 3450 

FNLLFTSAPPVIYGVLEKDV 1170 

TTC AAC CTC CTC TTC ACA TCT GCC CCT CCT GTC ATT TAT GGT GTT TTG GAG AAA GAT GTG 3510 

SAETLMQLPELYRSGQKSEA 1190 

TCT GCA GAG ACC CTC ATG CAA CTG CCT GAA CTT TAC AGA AGT GGT CAG AAA TCA GAG GCA 3570 

YLPHTFWITLLDAFYQSLVC 1210 

TAC TTA CCC CAT ACC TTC TGG ATC ACC TTA TTG GAT GCT TTT TAT CAA AGC CTG GTC TGC 3630 

FFVPYFTYQGSDTDIFAFGN 1230 

TTC TTT GTG CCT TAT TTT ACC TAC CAG GGC TCA GAT ACT GAC ATC TTT GCA TTT GGA AAC 3690 

PLNTAALFIVLLHLVIESKS 1250 

CCC CTG AAC ACA GCC GCT CTG TTC ATC GTT CTC CTC CAT CTG GTC ATT GAA AGC AAG AGT 3750 

LTWIHLLVIIGSILSYFLFA 1270 

TTG ACT TGG ATT CAC TTG CTG GTC ATC ATT GGT AGC ATC TTG TCT TAT TTT TTA TTT GCC 3810 

IVFGAMCVTCNPPSNPYWIM 1290 

ATA GTT TTT GGA GCC ATG TGT GTA ACT TGC AAC CCA CCA TCC AAC CCT TAC TGG ATT ATG 3870 

QEHMLDPVFYLVCILTTSIA 1310 

CAG GAG CAC ATG CTG GAT CCA GTA TTC TAC TTA GTT TGT ATC CTC ACG ACG TCC ATT GCT 3930 

LLPRFVYRVLQGSLFPSPIL 1330 

CTT CTG CCC AGG TTT GTA TAC AGA GTT CTT CAG GGA TCC CTG TTT CCA TCT CCA ATT CTG 3990 

RAKHFDRLTPEERTKALKKW 1350 

AGA GCT AAG CAC TTT GAC AGA CTA ACT CCA GAG GAG AGG ACT AAA GCT CTC AAG AAG TGG 4050 

RGAGKMNQVTSKYANQSAGK 1370 

AGA GGG GCT GGA AAG ATG AAT CAA GTG ACA TCA AAG TAT GCT AAC CAA TCA GCT GGC AAG 4110 

SGRRPMPGPSAVFAMKSATS 1390 

TCA GGA AGA AGA CCC ATG CCT GGC CCT TCT GCT GTA TTT GCA ATG AAG TCA GCA ACT TCC 4170 

CAIEQGNLSLCETALDQGYS 1410 

TGT GCT ATT GAG CAA GGA AAC TTA TCT CTG TGT GAA ACT GCT TTA GAT CAA GGC TAC TCT 4230 

ETKAFEMAGPSKGKES * 1427 

GAA ACT AAG GCC TTT GAG ATG GCT GGA CCC TCC AAA GGT AAA GAA AGC TAG 4281 

ATACCCTCCTTGGAGTTGCAAGTATTCTTTCAAGGTTGGAAGAGGG 

CTTGTTTTTCCATAAGGGACATGAGCATTTTACTAGGCTTGGAA 
ATACATTTGTGATAGAGGGCTAGAGTTTGACCTAGAGA 
TTTGTAAAACTTTTTGGATTTTGTAAAAGCATTTTC 

ATATATTTATTTTACTAGGAGATCTTATATTCTAGGGAAATGCTTTAAATC 
AAAAAAGTAGTTTTTAATACATTGGTTAGGACTCAGAGGAAATACGGAAAAJ^ 
AAATCCCAAGAGCCTTTTAAACAACAAGGTACCTAAAATAGGGTATAAT^ 
TAATAGCTTTTTATTTCCTATGGGAAGATGCTTTTGG 

TGTTTTCCTGAGGTGGAGCCTTCATTGGAAAGGGGAAAGAGGGATTCTAGGGTTTCATC 
TCTGTCAGGTTCCAATCAAGAGAAGACCTTTTATGAGATCTGCCTCTGTAT^ 
ATAGGTCAGGCAGACATCAGCTCAGCCTGTGGCCCATTGGGTGATTTCCTGTATTTTAA^ 
AAGTGATACAATCAATTTCAAAACAATCTTCCAGAGACCACTTGAAGGTTCATA 

CAGGTGTTGGAGCCTCTAAAATATGAGATATAAACAGAAACTAATACAAGTTGTTCTCTGGAGGTTTCTATGAGGTTCT 
TAGAAAAATTTGGTTTTAAAATCATTTGAGGACA 

GGCTTTAGTTGTAACAAACGATTTTATTCTAAGTAAGGCCAGGTGCTA 

TTTGAAGTCCTATCTCTATTTATTATATTTGAAAGTTGTCAGCCACC 

GCTCATATGCAATGTCTACATCAAGGTCTTCTTAATGACTATT 

TTTTGGTCTGACATTTTTGTAGCCTTCTGTTATTATTGGAAATAGTCTCTTACATAAGCTGATTT 

ATCTCACATAGCTAATGGAAGTTGCTTTCTGCTTTCTTATGACTGTTTTTATAAATAAACTGTTTC 

AAAAAAAAAGGGCGGCCGC 
Protein Family / Domain Matches, HMMer version 2 



Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL). 

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam 

Sequence file: /prod/ddm/wspace/orfanal/oa-script.14482.seq 

Query: 67102 

Scores for sequence family classification (score includes all domains): 
Model     Description                                     Score  E-value  N 
Hydrolase haloacid dehalogenase-like hydrolase              1.5     0.17  1 
DUF6      Integral membrane protein DUF6                  -24.6      9.4  1 



Parsed for domains: 
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value 
Hydrolase   1/1     432  1077 ..     1   184 []     1.5     0.17 
DUF6        1/1    1127  1271 ..     1   126 []   -24.6      9.4 
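
The "Parsed for domains" rows above can be read mechanically. The sketch below is one possible way to do so, assuming each cleaned row carries ten whitespace-separated fields in the column order shown in the header (model, domain, seq-f, seq-t, sequence end flags, hmm-f, hmm-t, model end flags, score, E-value); the names `DomainHit` and `parse_domain_row` are hypothetical and are not part of HMMER.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    model: str
    domain: str      # for example "1/1": domain index / domain count
    seq_from: int
    seq_to: int
    hmm_from: int
    hmm_to: int
    score: float
    evalue: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one 'Parsed for domains' row (assumed to have ten fields)."""
    fields = row.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {row!r}")
    # fields[4] and fields[7] are the two-character end flags ('..', '[]').
    return DomainHit(
        model=fields[0],
        domain=fields[1],
        seq_from=int(fields[2]),
        seq_to=int(fields[3]),
        hmm_from=int(fields[5]),
        hmm_to=int(fields[6]),
        score=float(fields[8]),
        evalue=float(fields[9]),
    )


hit = parse_domain_row("DUF6 1/1 1127 1271 .. 1 126 [] -24.6 9.4")
# Number of query residues spanned by the reported domain.
print(hit.model, hit.seq_to - hit.seq_from + 1, hit.score, hit.evalue)
```

Rows damaged by scanning (for example a missing end flag) would fail the field-count check rather than being parsed silently into the wrong columns.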

Alignments of top-scoring domains: 

Hydrolase: domain 1 of 1, from 432 to 1077: score 1.5, E = 0.17 

*->ikawFDkDGTLtdgkeppiaeaivealrelgl . . . .apleevekll 
i + + Dk+GTLt + + + ++ v+ ++ +++++++ e+ 
67102 432 iQYLFSDKTGTLTEN-KMVFRRCSVAGFDYCHEenarRLESYQEAVS 477 

grgl.g. . erilleggltaell 

+ + +++ ++1+ +++ ++ + ++ +++ + +++++++ +++ + +++ 
67102 478 EDEDf IdtVSGSLSNMAKPRAPscrtvhngplgnkpsnhlagssf tlgsg 527 



++ ++ +++ + +++ +++ ++++ + + + +++ + +++ + + 
67102 528 egasevphsrqaafsspietdwpdtrlldkfsqitprlfmpldetiqnp 577 



+ + + + ++++++++ + + + + ++ + ++ ++ ++ 

67102 578 pmetlyiidf fialaicntvwsapnqprqkirhpslgglpiksleeiks 627 



+ + +++++++ +++++++++ + + + + + + + ++ ++ + + + + 
67102 628 lfqrwsvrrssspslnsgkepssgvpnafvsrlplf srmkpaspveeevs 677 



+++ ++ + + +++++ + + + + + + + ++ ++ + + + + + 
67102 678 qvcespqcssssacctetekqhgdagllngkaeslpgqplacnlcyeaes 727 



+++ + + ++++++ + + + + ++++++ 

67102 728 pdeaalvyaarayqctlrsrtpeqvmvdfaalgpltfqllhilpfdsvrk 777 



+ + +++ ++ + + + + + + + ++++ + + + + + + + 

67102 778 rmsvwrhplsnqvwytkgadsvimellsvaspdgaslekqqmivrekt 827 



+ + + + + + ++ ++ ++++ + +++ +++ + + + + + + 
67102 828 qkhlddyakqglrtlciakkvmsdteyaewlrnhf laetsidnreellle 877 
Id. evlglial .dklypgarealkaLkerGikvailTngdr .na 

+ + ++ +lg+ +d 1 +g++e +-*-aL+ + +Gik+ + + lT+ + + ++a 
67102 878 samrleNKlTLLGATGIeDRLQEGVPESIEALHKAGIKIWMLTGDKQeTA 927 

ealle. . . algla . If daivdsdevggvgpvwgKPkpeif llalerlgv 
+ ++ + + 1 + ++++++++ +g + i+++++++ + 

67102 928 VNI AYackLLE PDdKLFI LNTQS KDA CGMLMSTILKELQKKTQA 971 

kpeevg 

pe+v+ +++ +++ +++++ + + ++++ + +++ +++ + 
67102 972 LPEQVSlsedllqppvprdsglragliitgktlefalqeslqkqf lelts 1021 

p . kvlmvGDginDapalaaAGvgv 

+ + ++ ++++ + +++ +l++GDg nD+ + + A++g+ 

67102 1022 wcqawccratplqksewklvrshlQvMTLAIGDGANDVSMIQVADIGI 1071 

amgngg<-* 

67102 1072 GVSGQE 1077 

DUF6: domain 1 of 1, from 1127 to 1271: score -24.6, E = 9.4 

*->fiWalytvfskklle. . spltf tawrf liagilllilllf Ikkgppl 
+++ ++++ +++ ++ + + ++ +1+ + ++ ++ + ++ 
67102 1127 LLFWYQFFCGFSGTSmtDYWVLIFFNLLFTS APPVIYGVLEKDV 1170 

lallslkilallylgilgtalgyllyf 

+ ++ + ++ +++++++ + + + 1 +y ++++++++y+ y 
67102 1171 saetlmqlpelyrsgqkseaylpHTFWITL-LDAFYQSLVCFFVPYFTYQ 1219 

yalkyvsaskasvlsslsPvf tlilsvllLgEkltlkqllGivlillGvl 
+ ++ + +++ +f+++l ++ +lt+++ll i+ ++1 + + 
67102 1220 - - GSDTV I FAFGNPLNTAALFI VLLHLVI ES KSLTW IHLLVI IGSILSYF 1267 

lisl<-* 
1 + 

67102 1268 LFAI 1271 
CLUSTAL W (1.74) multiple sequence alignment 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



MTEALQ WAR YHWRRL I RGATRDDDSG P YN YS S LLA - CGRKS SQ I PKLSGRHR I WP 

MERELPAAEESASSGWRRPRR- - RRWEGRTRTVRSNLLPPLGTEDSTIGAPKGERLLMRG 
* . * *** * * . * ** * • * * * . * - 

» . * • ■ « • • • • . * • . 

T^v 

H I QPFKDEYE KFSGAY VNNR I RTTKYTLLNFV PRNLFEQFHRfi ASLYFLFLW LNWVPLVl 
C I QHLAD NRLKTTKYTLLSFI jPKNLFEQFHRLANVYFVFIALIj WFVPAV 



*★..*★***** *.*.******** * .**.* 



.**.** * 



EAFQK^ TMLPLVVVLTIIAIIfflG^DYRKYKIDKQINNLITKVYSRKEKKYIDRCWKDV 

[NAFQPGIAI^PVLFIIAV^^ 
. * * * ... ★.. .*.. * * * * * * * ... *..**.* *.**.****..* * * . . 

tvgdfirlscnevipadmvllfstdpdgichJqe^c^^gesnlkqrqvwgyaeqdseto 



RVGDFVRLCCNE 1 1 PAD I LLLSSSDPDGLCH3 E' 




GETNLKRRQWRGFSELVSEFN 



****.** *★*.****.-** *.****.*★*★*. **★*.***-*★****..* ★ * 



Fbh67102FL 
mouseATSA 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



PEKFSSRIECESPNNDLSRFRGFLEHSNKERVGLSKENLLLRGCTIRNTEAWGIWYAG 
P LT FTS VI EC E KPNNDLS R FRG Y I MH SNGEKAG LHKENLLLRGCT I RNTEAVAG I V I YAG 
* *.* **** **********.. *** *. ** ***************** ***.*** 

HETKAMLNNSG PR YKRS KLERRANTEjJ/LWC VMLLV I MCLTGAVGHG I WLjS R Y - E KMHFFN 
HETKALIJ^SGPRYKRSOLEROMNCI JVLWCVLLLVCISLFSAVGH 

*****.***********.***. * ********** :% * # *****;* : ** ** ; *. 

VPE PDGH IIS E fcLAGF YMF WTMI I LLQVL I P I S LYV fc I E I V KLGQ I YF I QSDVD FYNEKM 
VPESTCSSLSPATAAVYSF lFTMIIVLQVLIPISLYVSIEIV{ ^CQVYFINQDIELYDEET 
***** .** * m * *.****.*****************. *:***:,*;:;*;*: 

DSIVQCRAUtflAEDLGQIQYljSsfi^^ 
DSOLOCRALNITEDLGQIKY3| ^ 

** .*******.******.* .** ****************.*.*.. * *. **.** ★** 



Fbh67102FL 
mouseATBA 



AVS EDEDF I DTVSGS LSNMAKPRAPS CRTVHNGPLGNKPSNHLiAGS S FTLGS - GEGAS EV 

ADSEEEEWS KV-GTIS HRGSTG - - S HQ S - 1 WMTHKTQS I KSHRRTGSRAEAKRAS 

***.★.. **..* * . *. - . * ** * 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



PHSRQAAFS S PI ETDWPDTRLLDKFSQ I TPRLFMPLDET I QNPPM El^YIIDFFI 

MLSKHTAFSSPMEKDITPDPKLLEKVSECD-R-FLAIARHQEHPLAHLSPELSDVFDFFI 

*...*****.**.**.**.**. * * . ..* ★ ..**** 

. . . .... . . ... .... .. .« 

~Xr*S 

ALAICNTVVV^^ 

ALT I CNTWVTS PDQ PRQKVRVRFELKS P VKT I ED FLRRFTPSRLASGCSS IGNLST 

*★.****★**..*.*****.* *.*..*. * . * *. 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATBA 



S-GVPNAFVSRLPLFSRMKPASPVEEEVSQVCESPQCSSSSACCTETEKQHGDAGLLNGK 
SKSSHKSGSAFLPSLSQDSMLLGLEEKIjGQT- -APSIASNGYASQAGQEESWASDCTT- - 

* . . . * » tA- - . * * « . ^ * ** • * ^ . • » • ^ 



AESLPG-Q- PIACN-LCYEAESPDEAALWAARAYQCTLRSRTPEqVMVDFAALGPLTFQ 

DQKCPGEQREQQEGELRYEAESPDEAALVYAARAYNCALVDRLHDQVSVELPHI^RIjTFE 
** * * ******************.*.* * -** *.• ** *** . 



Fbh67102FL 
mouseATBA 



LUmlPFDSVRKRMSVVVRHPLSNQVVVYTKGM 

LLiHTLGFDS I RKRMS WI RHPLTDE I NVYTKGADS WMDLLLPCS SDDARG - RHQKKI RS 



*** * ***.*******.**** 



********* .* .** 



Fbh67102FL 



KTQKHLDDYAKQGLRTLCIAKKVMSDTEYAEWLRNHFLAETSIDNREELLLESAMRLENK 
mouseAT5A 



Fbh67102FL 
mouseATSA 



Fbh67102FL 
mouseAT5A 



Fbh67102FL 
mouseATBA 



Fbh67102FL 
mouseATSA 

Fbh67102FL 
mouseAT5A 

Fbh67102FL 
mouseAT5A 

Fbh67102FL 
mouseATSA 



Fbh67102FL 
mouseAT5A 



Fbh67102FL 
mouseAT5A 



Fbh67102FL 
mouseATSA 



Fbh67102FL 
mouseATSA 



KTQNYLNLYAVEGLRTLCIAKRVXiSKEEYACWLQSHIEAEASVESREELLFQSAVRLETN 
***..*. ** .***★***★*.*.* *** **. *• **:*::.*****••**-*** - 

LTLLGATG I EDRIiQEGVPES I EALHKAG I KI WMLTGDKQETAVN I AYACKLLEPDDKLF I 
LHLLGATGIEDRLQEGVPETIAKLRQAGLQIWVLTGDKQETAINIAYACKLLDHGEEVIT 

* *****************.* *. ; ** ; .**.********* ; ********* : 

LNTQSKDACGMLMSTILKELQKKTQALPEQVS LSEDLLQPPVPR- -DSGLRAGLI IT 

LNADSQEACAALLDQCLSYVQSRNPRSTLQNSESNLSVGFSFNPVSTSTDASPSPSLVID 
**..*. : ** v * : ★ „ * * * * .: ** . *:. . * : * 

GKTLEFALQESLQKQFLELTSWCQAWCCRATPLQKSEW^^ 
GRSLAYALEKSLEDKFLFLAKQCRSVLCCRSTPL^ 

* : : * :**;;**:.:** * : . * :: * : *** : ****** ******* : * ;< *********** 


VsKlDVADIGIGVSGQEGMQAVMASDFAVSQFKHLSK^ 

VSgJip VAD VG VG I SGQEGMQAVMAS DFA VPRFRY LERLL I VHGHWC Y S RLANMVL Y F F Y K 

********.***.**************** .*--* .**•*******.**.*★.★***★* 

••••••• 

nr^y 

NVAYWLLFWYQFFCGFSGTSMTEj^ 

NTMSVGLLFWFQFYCGFSASAMII ^^LIFFNLLFSSLPQLVTGVp DKDVPADMLLREPQ 

* * ****.★*.**★* -.* * * ★★*★****.* * ***.*** *. *«. *. 

c 

LYRSGQKSEAYLPHTFWITLLDAFYQSLVCFFVPYFTYQGSDTDIEFAFGNPLNTAALFIV 
LYKSGONMEEYRPRAFWLNIWDAAFQSLVC^ 

★*•***. * * ★.***. ..** .******★.**•.* ** *.*..* *. - *** 

• • • • • • • • • « • • • * 



* * * * * * . * . * * . . * * . . * *... . * ** ******* ** 



LVCILTTSIALLPRFVlr RVLQGSLFPS PILR AKHFDRLTPEERTK 

LTCLIAPIAALLPRLFH KALQGSLFPTQLQLGRQLAKKPIiNKFSDPKETFAQGQPPGHSE 
*.*;::. * * * * * : 9 : : *******. * : : : * : * ::: 

ALKKWR G AGKMNQVTS KYAN - - QS AGK- SGRRP - MPG - PS AVFA - MKS 

TELSERKTMGPFETLPRDCASQASQFTQQLTCSPEASGEPSAVDTNMPLRENTLLEGLGS 
: . * * *.:•*.*.:: :::*:*. - * * . : : : : * 

ATS - CAI EQGNLS - LCET - ALDQGYSETKAFEMAG PSKGKES 

QASGSSMPRGAISEVCPGDSKRQSSSASQTARLSSLFHLPSFGSL1WISSLSLASGLGSV 
-* ...*•*•* . * * ★** 



LQLSGSSLQMDKQDGEFLSNPPQPEQDLHSFQGQVTGYL 
Input file Fbh44181pat.seq; Output File Fbh44181pat.tra 
Sequence length 7221 

GCCGCGGGATGGGAACGCGGCGCGGGGAGTGAGGCAGTGGCGGCGGCGGCGGTAAGCGGAACTTCGGCCCGAGGGGCTC 

GCCCGCTCCCGCCTCTGTCTTGTCGGCCTCCACCTGCAGCCCCGCGGCCCCCGCGCCCCGCGGGACCCGGACGGCGACG 

MWRWIRQQLGFDPPHQS 17 

ACGGGGGA ATG TGG CGC TGG ATC CGG CAG CAG CTG GGT TTT GAC CCA CCA CAT CAG AGT 51 

DTRTIYVAHRFPQNGLYTPQ 37 

GAC ACA AGA ACC ATC TAC GTA GCC CAC AGG TTT CCT CAG AAT GGC CTT TAC ACA CCT CAG 111 

KFIDNRIISSKYTVWNFVPK 57 

AAA TTT ATA GAT AAC AGG ATC ATT TCA TCT AAG TAC ACT GTG TGG AAT TTT GTT CCA AAA 171 

NLFEQFRRVANFYFLIIFLV 77 

AAT TTA TTT GAA CAG TTC AGA AGA GTG GCA AAC TTT TAT TTT CTT ATT ATA TTT TTG GTT 231 

QLMIDTPTSPVTSGLPLFFV 97 

CAG CTT ATG ATT GAT ACA CCT ACC AGT CCA GTT ACC AGT GGA CTT CCA TTA TTC TTT GTG 291 

ITVTAIKQGYEDWLRHNSDN 117 

ATA ACA GTA ACT GCC ATA AAG CAG GGA TAT GAA GAT TGG TTA CGG CAT AAC TCA GAT AAT 351 

EVNGAPVYVVRSGGLVKTRS 137 

GAA GTA AAT GGA GCT CCT GTT TAT GTT GTT CGA AGT GGT GGC CTT GTA AAA ACT AGA TCA 411 

KNIRVGDIVRIAKDEIFPAD 157 

AAA AAC ATT CGG GTG GGT GAT ATT GTT CGA ATA GCC AAA GAT GAA ATT TTT CCT GCA GAC 471 

LVLLSSDRLDGSCHVTTASL 177 

TTG GTG CTT CTG TCC TCA GAT CGA CTG GAT GGT TCC TGT CAC GTT ACA ACT GCT AGT TTG 531 

DGETNLKTHVAVPETALLQT 197 

GAC GGA GAA ACT AAC CTG AAG ACA CAT GTG GCA GTT CCA GAA ACA GCA TTA TTA CAA ACA 591 

VANLDTLVAVIECQQPEADL 217 

GTT GCC AAT TTG GAC ACT CTA GTA GCT GTA ATA GAA TGC CAG CAA CCA GAA GCA GAC TTA 651 

YRFMGRMIITQQMEEIVRPL 237 

TAC AGA TTC ATG GGA CGA ATG ATC ATA ACC CAA CAA ATG GAA GAA ATT GTA AGA CCT CTG 711 

GPESLLLRGARLKNTKEIFG 257 

GGG CCG GAG AGT CTC CTG CTT CGT GGA GCC AGA TTA AAA AAC ACA AAA GAA ATT TTT GGT 771 

VAVYTGMETKMALNYKSKSQ 277 

GTT GCG GTA TAC ACT GGA ATG GAA ACT AAG ATG GCA TTA AAT TAC AAG AGC AAA TCA CAG 831 

KRSAVEKSMNTFLIIYLVIL 297 

AAA CGA TCT GCA GTA GAA AAG TCA ATG AAT ACA TTT TTG ATA ATT TAT CTA GTA ATT CTT 891 

ISEAVISTILKYTWQAEEKW 317 

ATA TCT GAA GCT GTC ATC AGC ACT ATC TTG AAG TAT ACA TGG CAA GCT GAA GAA AAA TGG 951 

DEPWYNQKTEHQRNSSKILR 337 

GAT GAA CCT TGG TAT AAC CAA AAA ACA GAA CAT CAA AGA AAT AGC AGT AAG ATT CTG AGA 1011 
FISDFLAFLVLYNFIIPISL 357 

TTT ATT TCA GAC TTC CTT GCT TTT TTG GTT CTC TAC AAT TTC ATC ATT CCA ATT TCA TTA 1071 

YVTVEMQKFLGSFFIGWDLD 377 

TAT GTG ACA GTC GAA ATG CAG AAA TTT CTT GGA TCA TTT TTT ATT GGC TGG GAT CTT GAT 1131 

LYHEESDQKAQVNTSDLNEE 397 

CTG TAT CAT GAA GAA TCA GAT CAG AAA GCT CAA GTC AAT ACT TCC GAT CTG AAT GAA GAG 1191 

LGQVEYVFTDKTGTLTENEM 417 

CTT GGA CAG GTA GAG TAC GTG TTT ACA GAT AAA ACT GGT ACA CTG ACA GAA AAT GAG ATG 1251 

QFRECSINGMKYQEINGRLV 437 

CAG TTT CGG GAA TGT TCA ATT AAT GGC ATG AAA TAC CAA GAA ATT AAT GGT AGA CTT GTA 1311 

PEGPTPDSSEGNLSYLSSLS 457 

CCC GAA GGA CCA ACA CCA GAC TCT TCA GAA GGA AAC TTA TCT TAT CTT AGT AGT TTA TCC 1371 

HLNNLSHLTTSSSFRTSPEN 477 

CAT CTT AAC AAC TTA TCC CAT CTT ACA ACC AGT TCC TCT TTC AGA ACC AGT CCT GAA AAT 1431 

ETELIKEHDLFFKAVSLCHT 497 

GAA ACT GAA CTA ATT AAA GAA CAT GAT CTC TTC TTT AAA GCA GTC AGT CTC TGT CAC ACT 1491 

VQISNVQTDCTGDGPWQSNL 517 

GTA CAG ATT AGC AAT GTT CAA ACT GAC TGC ACT GGT GAT GGT CCC TGG CAA TCC AAC CTG 1551 

APSQLEYYASSPDEKALVEA 537 

GCA CCA TCG CAG TTG GAG TAC TAT GCA TCT TCA CCA GAT GAA AAG GCT CTA GTA GAA GCT 1611 

AARIGIVFIGNSEETMEVKT 557 

GCT GCA AGG ATT GGT ATT GTG TTT ATT GGC AAT TCT GAA GAA ACT ATG GAG GTT AAA ACT 1671 

LGKLERYKLLHILEFDSDRR 577 

CTT GGA AAA CTG GAA CGG TAC AAA CTG CTT CAT ATT CTG GAA TTT GAT TCA GAT CGT AGG 1731 

RMSVIVQAPSGEKLLFAKGA 597 

AGA ATG AGT GTA ATT GTT CAG GCA CCT TCA GGT GAG AAG TTA TTA TTT GCT AAA GGA GCT 1791 

ESSILPKCIGGEIEKTRIHV 617 

GAG TCA TCA ATT CTC CCT AAA TGT ATA GGT GGA GAA ATA GAA AAA ACC AGA ATT CAT GTA 1851 

DEFALKGLRTLCIAYRKFTS 637 

GAT GAA TTT GCT TTG AAA GGG CTA AGA ACT CTG TGT ATA GCA TAT AGA AAA TTT ACA TCA 1911 

KEYEEIDKRIFEARTALQQR 657 

AAA GAG TAT GAG GAA ATA GAT AAA CGC ATA TTT GAA GCC AGG ACT GCC TTG CAG CAG CGG 1971 

EEKLAAVFQFIEKDLILLGA 677 

GAA GAG AAA TTG GCA GCT GTT TTC CAG TTC ATA GAG AAA GAC CTG ATA TTA CTT GGA GCC 2031 

TAVEDRLQDKVRETIEALRM 697 

ACA GCA GTA GAA GAC AGA CTA CAA GAT AAA GTT CGA GAA ACT ATT GAA GCA TTG AGA ATG 2091 

AGIKVWVLTGDKHETAVSVS 717 

GCT GGT ATC AAA GTA TGG GTA CTT ACT GGG GAT AAA CAT GAA ACA GCT GTT AGT GTG AGT 2151 
LSCGHFHRTMNILELINQKS 737 

TTA TCA TGT GGC CAT TTT CAT AGA ACC ATG AAC ATC CTT GAA CTT ATA AAC CAG AAA TCA 2211 

DSECAEQLRQLARRITEDHV 757 

GAC AGC GAG TGT GCT GAA CAA TTG AGG CAG CTT GCC AGA AGA ATT ACA GAG GAT CAT GTG 2271 

IQHGLVVDGTSLSLALREHE 777 

ATT CAG CAT GGG CTG GTA GTG GAT GGG ACC AGC CTA TCT CTT GCA CTC AGG GAG CAT GAA 2331 

KLFMEVCRNCSAVLCCRMAP 797 

AAA CTA TTT ATG GAA GTT TGC AGA AAT TGT TCA GCT GTA TTA TGC TGT CGT ATG GCT CCA 2391 

LQKAKVIRLIKISPEKPITL 817 

CTG CAG AAA GCA AAA GTA ATA AGA CTA ATA AAA ATA TCA CCT GAG AAA CCT ATA ACA TTG 2451 

AVGDGANDVSMIQEAHVGIG 837 

GCT GTT GGT GAT GGT GCT AAT GAC GTA AGC ATG ATA CAA GAA GCC CAT GTT GGC ATA GGA 2511 

IMGKEGRQAARNSDYAIARF 857 

ATC ATG GGT AAA GAA GGA AGA CAG GCT GCA AGA AAC AGT GAC TAT GCA ATA GCC AGA TTT 2571 

KFLSKLLFVHGHFYYIRIAT 877 

AAG TTC CTC TCC AAA TTG CTT TTT GTT CAT GGT CAT TTT TAT TAT ATT AGA ATA GCT ACC 2631 

LVQYFFYKNVCFITPQFLYQ 897 

CTT GTA CAG TAT TTT TTT TAT AAG AAT GTG TGC TTT ATC ACA CCC CAG TTT TTA TAT CAG 2691 

FYCLFSQQTLYDSVYLTLYN 917 

TTC TAC TGT TTG TTT TCT CAG CAA ACA TTG TAT GAC AGC GTG TAC CTG ACT TTA TAC AAT 2751 

ICFTSLPILIYSLLEQHVDP 937 

ATT TGT TTT ACT TCC CTA CCT ATT CTG ATA TAT AGT CTT TTG GAA CAG CAT GTA GAC CCT 2811 

HVLQNKPTLYRDISKNRLLS 957 

CAT GTG TTA CAA AAT AAG CCC ACC CTT TAT CGA GAC ATT AGT AAA AAC CGC CTC TTA AGT 2871 

IKTFLYWTILGFSHAFIFFF 977 

ATT AAA ACA TTT CTT TAT TGG ACC ATC CTG GGC TTC AGT CAT GCC TTT ATT TTC TTT TTT 2931 

GSYLLIGKDTSLLGNGQMFG 997 

GGA TCC TAT TTA CTA ATA GGG AAA GAT ACA TCT CTG CTT GGA AAT GGC CAG ATG TTT GGA 2991 

NWTFGTLVFTVMVITVTVKM 1017 

AAC TGG ACA TTT GGC ACT TTG GTC TTC ACA GTC ATG GTT ATT ACA GTC ACA GTA AAG ATG 3051 

ALETHFWTWINHLVTWGSII 1037 

GCT CTG GAA ACT CAT TTT TGG ACT TGG ATC AAC CAT CTC GTT ACC TGG GGA TCT ATT ATA 3111 

FYFVFSLFYGGILWPFLGSQ 1057 

TTT TAT TTT GTA TTT TCC TTG TTT TAT GGA GGG ATT CTC TGG CCA TTT TTG GGC TCC CAG 3171 

NMYFVFIQLLSSGSAWFAII 1077 

AAT ATG TAT TTT GTG TTT ATT CAG CTC CTG TCA AGT GGT TCT GCT TGG TTT GCC ATA ATC 3231 

LMVVTCLFLDIIKKVFDRHL 1097 

CTC ATG GTT GTT ACA TGT CTA TTT CTT GAT ATC ATA AAG AAG GTC TTT GAC CGA CAC CTC 3291 

HPTSTEKAQLTETNAGIKCL 1117
CAC CCT ACA AGT ACT GAA AAG GCA CAG CTT ACT GAA ACA AAT GCA GGT ATC AAG TGC TTG 3351 

DSMCCFPEGEAACASVGRML 1137 
GAC TCC ATG TGC TGT TTC CCG GAA GGA GAA GCA GCG TGT GCA TCT GTT GGA AGA ATG CTG 3411

ERVIGRCSPTHISSSWSASD 1157 
GAA CGA GTT ATA GGA AGA TGT AGT CCA ACC CAC ATC AGC AGT TCA TGG AGT GCA TCG GAT 3471 

PFYTNDRSILTLSTMDSSTC 1177 
CCT TTC TAT ACC AAC GAC AGG AGC ATC TTG ACT CTC TCC ACA ATG GAC TCA TCT ACT TGT 3531

* 1178 
TAA 3534 

AGGGGCAGTAGTACTTTGTGGGAGCCAGTTCACCTCCTTTCCTAAAATTCAGTC 

AGCTCTGAAATTAATTTCCAAAATCTTTGTAGTAGTTCATACCCAC 

TAGTACAAGCCCCTCCCAACACCCTTAATTTGAATCTGAACATGTTAAAAT 

TTTGTCTGGTTTGTCCCTTGTGCTTATGGGA 

AAATGTAGAAAAAAGAGAGAAATCTTAGTAAAGAGTATTTTTTAGTATTAGCTTG 
TGCTTCTGTAAATTATGCTGAAAGTTTGCCTTGAGAAC 

AAAAGTTAATGTGAATACTGAGGAATTTTGGTCCCTCAGTGACCTGTGTTGTTAATTCATTAATGCATTCTGAGTTCAC
AGAGCAAATTAGGAGAATCATTTCCAACCATTATTTACTGCAGTATGGGGAGTAAATTTATA 

ACTGTAACACAGCCTGTAAAGTTAGCCATATAAATGCAAGGGTATATCATATATACAAATCAGGAATCAGGTCCGTTCA 
CCGAACTTCAAATTGATGTTTACTAATATTTTTGTGACAGAGTATAAAGACCCTATAGTGGGTAAATTAGATACTATTA
GCATATTATTAATTTAATGTCTTTATCATTGGATCTTTTGCATGCTTTAATCTGGTTAACATATTTAAATTTGCTTTTT
TTCTCTTTACCTGAAGGCTCTGTGTATAGTATTTCATGACA 
GTATTTAAATATTGCAAATATGTTTAATTATACAAATCAGAATAGTATGGGTAA 
TCTTTCTGCAGCCGACTTAGACATGCTCTTCCCTTTCT 

TATTTTCAGGTTATGTCATCTAACTTATAGCAAACTACCACAATACAGTGAGTTCTGC 

ATTTCAGGTGTGGCTGTGGAATGTAAAAATGCTCAACTTGTATCAGGTAATC 

ATTAATCGGGTACATGTTACTGTAATTAACTCATTGCACTTCAAAACC 

AGTATTGTCATTTGTTTTTGTTTTATTGAAAAGTAATGTTGTCTTAAGATTTAGAAGT^TTGAGAACTAT

TACCCAGCTCTAAGCAAATAATGATTGTATACATATTAAGA 

AATGTAATTCCTTTATGGAGATTTATTGTGCT^GCCCTAAGCTTC 

GACTGGCAGGGGAAAGAATGGTAGAGACAGAAATTAAGACTTTATCCTTC 

ATGTAACATTTGTCTGTTCCAGTGATGTAAGGATATTAAGTTATTAAGCT 

TTAACTTAGATATTTCATAGCTGGATTTAGGAAGATC

GTCTGCATTCACTAATTCATGTTCCAGAAGAGGAAATAATGAAGATATACTC^ 

AAATTTGCTCATAAAATCTCTTATAAAACGTGCATATAACAAAATGACACCCAGTAGGCCTC 

CGTGTTTATTTGCCATCAAATAAACTGAGTACTGACACCAGACAAAGACTC 

TGCAGCAAGACAGGAGGTCAGCTCGCCTATAATGGTGCTTAAAGTGTGATTGATGTAATTTTCTGTACTCACCATT^ 

AGTTAGTTAAGGAGAACTTTATTTTTTTAAAAAAAGTAAATGG^ 

AATCCACTCCGTTTTTAAAGCAAAATTATCTTGTGATT^ 

CAGTCTGCAAGCTTTCAGTAGTTTTCTAGTGCTATATTCATCCTGTAAAACTC 
AAAGTGTCCCCTTTGCATATTTCTTTAAAATTCTTTC 

CAAAACCAGAGCAAAATGCTAAATACGTTATTGCTAATCAGTGGTCTCAAATC 
GGGCTGTAAGCCTGAAGATAGTGGCAAGCACCAAGTCAGTTTCCAAAATTGCCCCTC 
CCCTGCCTCAGCTTCAGCAGGCGTAGGCTCACCCTGGGCGGAGCAAAGTATGGGCCAGGGAGAACTA 
CCTGCTGTCGAGTTGAGAAAAGGGGAGAATTTATGGTCTGAATTTTC 

AATACACAAAGGCTTCCAGACCTGAGCCACACCCAGGCCCTATCCTGAACAGGAGACTAAACAGAGGCAAATCAACCC 
AGGAAATACTTGCATTCTGCCCTACGGTTAGTACCAGGACTGAGGTCATTTC^ 

TATCTGATCGCTTGAGACTCCTAATAGGCAGGAGTCAAGGCCACTAGAAAATTGACAGTTAAGAGCCAAAA 

AATATGCTACTCTGAAAAATCTCGTGAAGGCTGTAGGAAAAGGGAGAATCTTCCATGTTC 

CAGTTTGGGGTATGATATAAGCAGGTATTAATAAAAATAACACACCAAAGAGTTACGTAAAACATC 

GGTCCCCACGTACAGACATTTTATTTCTATTTTGAAATGAGTTATCTATTTTCATAAAAGTAAAACACTATTAAAGTGC
TGTTTTATGTGAAATAACTTGAATGTTGTTCCTATAAAAAATAG 

TTAGATTTTTATGAGGAATGAGTATCTGGAAATATTGTAGCAATACTTGGTTTAAAA 

CTGTCTAATGTAATCCTTTAAAAATTCTCTGCATTCTC 

TAAAGTTTATGAAGTTATATTTATCAAATAAAAACTTTCCTATAT 
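
Each row of the translation listing above pairs a line of codon triplets with the one-letter residues they encode, so the two lines can be cross-checked mechanically. The following is a minimal Python sketch of such a check. It assumes the standard genetic code, which the listing itself does not state, and the names `CODON_TABLE` and `translate_row` are illustrative only and not part of the original program output.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption: the listing does not say which code table was used.
BASES = "TCAG"
RESIDUES = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), RESIDUES)
}


def translate_row(codon_line: str) -> str:
    """Translate the codon triplets found on one line of the listing.

    Tokens that are not exactly three bases (for example the trailing
    nucleotide position) are skipped, so a codon that has been split by a
    stray space is dropped rather than guessed.
    """
    codons = [
        tok
        for tok in codon_line.upper().split()
        if len(tok) == 3 and set(tok) <= set(BASES)
    ]
    return "".join(CODON_TABLE[c] for c in codons)


# Codons copied from the listing above; the second call is the stop codon.
print(translate_row("ATC CTT GAA CTT ATA AAC CAG AAA TCA 2211"))
print(translate_row("TAA 3534"))
```

Comparing the returned string with the printed residue line flags rows where a residue or codon has been corrupted; a length mismatch usually points to a split or missing codon.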

Protein Family / Domain Matches, HMMer version 2



Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL) . 

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam
Sequence file: /prod/ddm/wspace/orf anal/oa-script.15759.seq

Query: 44181 

Scores for sequence family classification (score includes all domains):
Model        Description                             Score    E-value  N
Hydrolase    haloacid dehalogenase-like hydrolase     42.8      8e-09   1
E1-E2_ATPase E1-E2 ATPase                              8.6       0.13   1
DUF132       Protein of unknown function DUF132      -72.9        9.4   1

Parsed for domains:
Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
E1-E2_ATPase   1/1     126   164 ..    37    75 ..     8.6     0.13
DUF132         1/1     579   719 ..     1   160 []   -72.9      9.4
Hydrolase      1/1     401   842 ..     1   184 []    42.8    8e-09
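
The "Parsed for domains" rows above can be read into records for filtering by E-value. The sketch below is illustrative only: it assumes the HMMER 2 row layout in which each coordinate pair is followed by a two-character end flag (`..` or `[]`), giving ten whitespace-separated fields, and the names `DomainHit`, `parse_domain_row` and the 0.01 cutoff are hypothetical choices, not part of hmmpfam.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    model: str
    domain: str     # e.g. "1/1" (domain index / number of domains)
    seq_from: int
    seq_to: int
    seq_ends: str   # end flags for the sequence coordinates, ".." or "[]"
    hmm_from: int
    hmm_to: int
    hmm_ends: str   # end flags for the model coordinates
    score: float
    evalue: float


def parse_domain_row(line: str) -> DomainHit:
    """Parse one whitespace-separated row of the domain table."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    model, domain, sf, st, s_ends, hf, ht, h_ends, score, evalue = fields
    return DomainHit(model, domain, int(sf), int(st), s_ends,
                     int(hf), int(ht), h_ends, float(score), float(evalue))


# Rows as reported for query 44181 in the table above.
rows = [
    "E1-E2_ATPase 1/1 126 164 .. 37 75 .. 8.6 0.13",
    "DUF132 1/1 579 719 .. 1 160 [] -72.9 9.4",
    "Hydrolase 1/1 401 842 .. 1 184 [] 42.8 8e-09",
]
hits = [parse_domain_row(r) for r in rows]

# Hypothetical cutoff: keep only domains with E-value <= 0.01.
significant = [h for h in hits if h.evalue <= 0.01]
for h in significant:
    print(h.model, h.seq_from, h.seq_to, h.evalue)
```

With this cutoff only the Hydrolase domain is retained for query 44181; a different threshold is a matter of judgment and is not implied by the report.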



Alignments of top-scoring domains: 

E1-E2_ATPase: domain 1 of 1, from 126 to 164: score 8.6, E = 0.13
*->VlRdGkeeeipaeeLvpGDiVevkpGdrVPADgrvvege<-*
V+R G++++ ++ +++GDiV+++ ++ PAD+++++++
44181 126 VVRSGGLVKTRSKNIRVGDIVRIAKDEIFPADLVLLSSD 164

DUF132: domain 1 of 1, from 579 to 719: score -72.9, E = 9.4 

*->MeeiklkWIDTsVliaA LispkGlaf kllelLf eeKleN . . 

M v++ A+++++L+ kG +1 +++ +e ++ 

44181 579 MS VIVQApsgekLLFAKGAESSILPKCIGGEIEKtr 614 

. YtSdeiLeEyifkillpKLekklpvEvslkkvl . wlvskSkvinPRSF 
++ + L+ + +1 + +k+ + E ++ + + +++ + 
44181 615 iHVDEFALKGL- -RTLCIAYRKFTSKE- - YEEIDkRIFEARTALQQR 657 

KESntkFnvcRDpeDNKFLn . . . wYesKAdvlITyDkDLLdRLRDENkk 
+k + F++++ + + A +L d+ R 

44181 658 EEKLAAV FQFIEkdl ILLGATA VEDRLQDKVRETIEA 694 

lkledHefkvLTPkEFiesveKkls<-* 
1++ + + + + vLT v +ls 

44181 695 LRMAGIKVWVLTGDKHETAVSVSLS 719 

Hydrolase: domain 1 of 1 , from 401 to 842: score 42.8, E = 8e-09 

* - > ikawFDkDGTLt dgk 

+ v+ Dk+GTLt + ++++++++++ +++ ++++++++++ 
44181 401 VEYVFTDKTGTLTENEmqf rec s ingmkyqe ingr 1 vpegp tpds se 447 



++ + ++ ++ ++ ++ +++++ +++++++++ ++++ + + ++ 
44181 448 gnlsylsslshlnnlshlttsssf rtspenetelikehdlf fkavslcht 497 

eppiaeaivealrelgl . . . . 

+ ++ +++ +++++ +++ +++ + + p ++a+vea++++g+ + 
44181 498 vqisnvqtdctgdgpwqsnlapsqleyyaSSPDEKALVEAAARIGIvfig 547

apleevekllgrgl .g . . . erilleggltaell 

+++e +e +++ 1++ + +++++ ++++ + ++ 

44181 548 NSEETMEVKTLGKLeRyklLHILEFDSDRRRMSvivqapsgekllfakga 597 



+ + + ++ +++ + + + + + ++ ++ ++ ++++ ++ +++ 

44181 598 essilpkciggeiektrihvdefalkglrtlciayrkftskeyeeidkri 647 

Id. evlglial . dklypgarealkaLke 

+ ++ ++++ + d +lg+ a++d 1 + +re+++aL+ 

44181 648 f ear t a 1 qqr eekl aa vf qf i eKD 1 1 LLGATAVeDRLQDKVRET I EALRM 697 

rGikvailTngdr .naeallealgla . If daivdsdevggvgpwvgKPk 
+Gikv++1T++ +++a+ + ++ + + + + + + K + 
44181 698 AG I KVWVLTGDKHeTAVS VSlSCGHFHRTMNIL ELINQKSD 738 

peif llalerlgvkpeevg 

e+++++++ ++ e ++ +++ + +++ + +++++ + ++ 
44181 739 SECAEQLRQLARRITE--Dhviqhglwdgtslslalreheklfmevcrn 786 

p . kvlmvGDginDapalaaAGvgv 

+ +++++++ +++ +++l+vGDg nD+ ++ A+vg+ 

44181 787 csavlccrmaplqkakvirlikispeKpITLAVGDGANDVSMIQEAHVGI 836 

amgngg<-* 
+ 

44181 837 GIMGKE 842 

CLUSTAL W (1.74) multiple sequence alignment



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



MFRRSLNRFCAGEEKRVGTRTVFVGN-HPVSETEAYIAQRFCDNRIVSSKYTLWNF 

MDCSLLRTLVRRYCAGEENWVDSRTIYVGHKEPPPGAEAYIPQRYPDNRIVSSKYTFWNF 

MWRW I RQ0LGFDPPHQSDTRT I YVAH --RF PQNGL YTPQKF I DNR 1 1 S S KYTVWNF 

. * . • .**-.*. **..****.******** 



L PKNLFEQFRRI ANFYFL 1 1 FLVQVTVD TPTS PVTSGLPLFFVITVTAI WQGYEDWLRHR 
IPKNLFEQFRRIANFYFLIIFLVQLIIDgTT^PVTSGLPLFFVITVTAigQGYEDWLRHK 
VPKNLFEQFRRVANFYFLg I FLVQLMIDTPTS ^VTSGLPLFFVITVTAI? KQGYEDWLRHN 
: ********** . ************ ; : ******************************** 9 

ADNEVNKSTVYIIENAKJRVRKESEKIKVGDVVEVQADETFPCDLILLSSCTTDGTCYVTF 
ADNAMNQCPVHFIQHGKLVRKQSRKLRVGDI VMVKEDETFPCDLI FLSSNRADGTCHVT T 
SDNEVNGAPVY WRSGGLVKTRS KN I RVGD I VR I AKDE I FPADLVLLS SDRLDG SCHVTjT, 
.** . * : ** ** ^ ** : : *★* ★* ; * : *^ ** 



as ldgesnckthyavrdti alctaes idtlraai eceqpqpdlykfvgrini ysnsleav 
asSgesshkthyavqdtkgfhteadvdslhatieceqpqpdlykfvgrinvyndlndpv 
as lpgetnlf kthvavpetallqtvanldtlvav i ecqqpeadl yrfmgrm - 1 1 tqqmee i 

******. *★* *★ .* * . * . * * ***.★*. *** : *.** : : m : . : 



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL 

MouseATlH 

Fbh44181 



ARS LGPENLLLKGATLKNTEKI YGVAVYTGMETKMALNYQGKSQKRS AVEKS INAi L I VY 
VRPLGSENLLLRGATLKNTEKIFGVAIYTGMETKMALNYQSK^ LIVY 
VRPIX3PESLLLRGARLKNTKEI FGVAVYTGMET 

* ** * ****** ****..*.***.************-.***********;*;***:* 




LFILLTKAAVCTTLKYVWQSTPYNDEPWYNQKTQKERETLKVLKMFTDF L.SFMVL.FNFI I 
LCILVSKALINTVLKYVVtoSEPFRDEPWYNEKTESERQRNLFLRAFTDF LAFMVLFNYI I 
LVILISEAVISTILKYTWR AEE KWDEPWYNQKTEHQRNS SKI LRF I S D FlLAFLVLYNFI I 

* **.-.* : * *****. ★★****.★*. ; * : . * ; ..***.* : ** ; * ; ** 


PVSMYVT^ "~~ ~ 
PVSMYVTVEI^KFLGSYFITWDEDMFDEEMGEGPLVNTSDLNEELGQVEYIfF 
P I SLYVTVEMQKFLGS FF I GWDL.DLYHEESDQKAQVNTSDLNEELGQVE' 




KTGTLTj 
KTGTLT\ 
)KTGTLT/ 



*.*.************-** ** * 



************* - * - ********* 



Fbh67076FL 

MouseATlH 

Fbh44181 



ENSMEF I ECC I DGHKYKG VTQE VDGLSQTDGTLTYFDKVD 

ENNMAFKECC I EGHVYVPH VI CNGQVLPDS SG- IDMIDSS PGVC 

ENEMQFRECSINGMKYQEINGRLVPEGPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRT 
*******.* * : : . : * : 



Fbh67076FL

MouseATlH 

Fbh44181 



KNREELFLRALCLCHTVE I KTN DAVDG ATE SAELTY I S S S PDE I A 

GREREELFFRAI CLCHTVQ VKDDHCGDDVDGPQK - S PDAKS C VY ISSS PDEVA 

SPENETELIKEHDLFFKAVSLfCHTVQISNVQTDCTGDGPWQSNLAPSQLEYYASSPDEKA 
.**..*.*****-. ** . *.****** 



Fbh67076FL 

MouseATlH 

Fbh44181 



LVKGAKRYGFTFLGNRNGYMRVENQRKEIEEYELLHTI^FDAVRRRMSVIVKT 
LVEGVQRLGFTYLRLKDNYME I LNRENDI ERFELLEVLTFDS VRRRMS VIVKSTTGEIYL 
LVEAAAR I G I VF I GNS EETME VKTLG - KLER Y KLLH I LE FDSDRRRM S V I VQ AP S GEKLL 
**. **- ». ♦ * . «* ..** ***-********.. * - * 



Fbh67076FL 

MouseATlH 

Fbh44181 



FCKGADSAVFPRVQNHE I ELTKVHVERNAMDG YRTLCVAFKE I APDD YER I NRQL I EAKM 
FCKGADSS I FPRVIEGKVIXJVRSRVERNAVEGLRTLCVAYKRLEPEQYEDACRLLQSAKV 
FAKGAES S I LPKC IGGE I EKTRI HVDEFALKGLRTLC I AYRKFTS KEYEE I DKRI FE7^RT 
* ***.*. . . * . ... : * • * ****:*::.: : : -*: 



Fbh67076FL 
MouseATlH 



AL.QDREEKMEKVFDDI ETNMNLI GATAVEDKLiQDQAAETI EALHAAGLKVWVLTGDKMET 
ALQDREKKLAEAYEQI EKDLVLLGATAVEDRLQEKAADTI EALQKAGI KVWVLTGDKMET 
Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181

Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL

MouseATlH

Fbh44181



Fbh67076FL 

MouseATlH 

Fbh44181 



Fbh67076FL 

MouseATlH 

Fbh44181 



Fbh67076FL 

MouseATlH 

Fbh44181 



ALQQREEKLAAVFQFI EKDLI LLGATAVEDRLQDKVRETI EALRMAG I KVWVLTGD KHET 
***.**.*. *★ *.******★.**.. .*****. **.********* ** 

AKSTCYACRLFQTNTELLELTTKTIEESERKEDRLHELLIEYRKKLLHEFPKSTR-SFKK 

ASATCYACKLFRRSTQLLELTTKKLEEQS LHDVLFDLSKTVLRCSGSMTRDS FSG 

AVS VS LSCGHFHRTMN I LEL I NQKSDS EC A - EQLRQLARRI T EDHVI 

★ . * * . . . * * * . . * - 

■ » * * ■ • « « .»»••« * • • • 

AWTEHQEYGLI IDGSTLSLI LNSSQD- SSSNNYKS I FLQI CMKCTAVLCCRMAPLQKAQI 

LSTDMHDYGL 1 1 DGAALS L I MKPREDGS S SGNYRELFLEI CRNCSAVLCCRMAPLQKAQ I 

QHGLWDGTSLSLALREHEK LFMEVCRNCSAVLCCRMAPLQKAKV 

..**. .**..*** . . .*...* .*.*************.. 

VRMVKNLKGS P I TLSflbDGANDV^lLESHVG IG I KGKEGRQAARNSDYSVPKFKHLKKL 

VKLI KFSKEHPITLA IKDGANDVS MI LEAHVGIGVIGKEGRQAARNSDYAI PKFKHLKKM 

IRLIKISPEKPITIJ^DGANDV^^EAHVGIGIMGKEGRQAARNSDYAIARFKFLSKL 
....* ****..********** *.★****- *************•. .** ★ *. 

» • • a • « • • • • • • j+» • * 

llahghlyyvriahEvqyf^^ 

llvhghfyyirise lvqyffyknvcfifpqflyqf fcgfsqqtlydmyltlynisftsl 
lfvhghfyy iriai tlvqyffyknvcfitpqflyqff yclfsqqtlyd syyltlynicftsl 

*:.***:**:**: *********;*** ********* **** # ***. ^ *** ; ****** * 
PILAYSLLjSQHINIDTLTSDPRLYMKISGN^ 

PILLYSLM 3QHVG I D VXiKRD PTL YRD I AKNALLRWRV FIYWTFLGVFDALVFFFGAYFIF 

P I L I YSLLE QHVD PHVLQNKPTLYRD I S KNRLLS I IC qFLYWTILGFSHAFI FFFGSYLLg : 
*** ***.★**. * ★ ** *- * •* *•***.* .****.*. . . 

• •••• » * • • • ••••• • a m m 

T<^f jyvf^ 

-QTASLEENGK^£TONWTFGTIVFT 
~ENTTVTINGQME<^ 

G KDTS LLGNG Qf^F^NWTFGTLVFTVMV I TVT^ i54ALETHFWTW [E NHL VTWGS 1 1 FYFVFS 
. :;: ★*...*******.******.***.*.**.★..***★**;* * * * : * ★ ** 

FFWGbl IWPFLKQQRMYFVFAdhLSSVSTWIiAIILLIFISLFPEIL^V^ 

LLWQGVI WPFLSYQRMYYVFISplIjSSGPAWLGI ILLVTVGLLP DVIj KK^LCRQLWPTATE 
LFYqG ILWPFLGSQNMYF ^FIQLLSSGSAWFAIILMVVTCLFLg riKK^ 
...**.♦**** * ***** .*** .*. *** . . * . ... *. 

« •» •> m * » • * * • * ■ • **** » 

- - KNVRRRSAR RNLSCRRASDS — LSAR PSVRPLLLRTFSDESNVL 

RTQNIQHQDSISEFTPLASLPSWGAQGSRLLAAQCSSPSGRVVCSRWESEECPVLPLHPG 
KAQLTETNAGI KCLDSMCCFPEGEAACAS - VGRMLERVIGRCS PTHI SS SWS ASDPFYTN 

• • * » a *»* 



LPHKARYGCCRSSLEMPT 
DRS I LTLSTMDS STC - - - 

Input file Fbh67084FL.seq; Output File Fbh67084FL.tra
Sequence length 4198 

GGAGTCGACCCACGCGTCCGCATTGAGACAATGCCTCCACAAATACTTGATGCAAAATTCAGTAAGACA 



AATCACCATTATAGTTTCTGACAAATTGTTCTCAAA 

MPLMMSEEGFENEESDYHTL 20 

ATG CCA CTA ATG ATG TCT GAA GAA GGC TTT GAG AAT GAG GAA AGT GAT TAC CAC ACC TTA 60 

PRARIMQRKRGLEWFVCDGW 40 

CCA CGA GCC AGG ATA ATG CAA AGG AAA AGA GGA CTG GAG TGG TTT GTC TGT GAT GGC TGG 120 

KFLCTSCCGWLINICRRKKE 60 

AAG TTC CTC TGT ACC AGT TGC TGT GGT TGG CTG ATA AAT ATT TGT CGA AGA AAG AAA GAG 180 

LKARTVWLGCPEKCEEKHPR 80 

CTG AAA GCT CGC ACA GTA TGG CTT GGA TGT CCT GAA AAG TGT GAA GAA AAA CAT CCC AGG 240 

NSIKNQKYNVFTFIPGVLYE 100 

AAT TCT ATA AAA AAT CAA AAA TAC AAT GTG TTT ACC TTT ATA CCT GGG GTT TTG TAT GAA 300 

QFKFFLNLYFLVISCSQFVP 120 

CAA TTC AAG TTT TTC TTG AAT CTC TAT TTT CTA GTG ATA TCC TGC TCA CAG TTT GTA CCA 360 

ALKIGYLYTYWAPLGFVLAV 140 

GCA TTG AAA ATA GGC TAT CTC TAC ACC TAC TGG GCT CCT CTG GGA TTT GTC TTG GCT GTT 420 

TMTREAIDEFRRFQRDKEVN 160 

ACT ATG ACA CGG GAA GCA ATT GAT GAA TTT CGG CGT TTT CAG CGT GAC AAG GAA GTG AAT 480 

SQLYSKLTVRGKVQVKSSDI 180 

TCA CAA CTA TAT AGC AAG CTT ACA GTA AGA GGT AAA GTG CAA GTT AAG AGT TCA GAC ATA 540 

QVGDLIIVEKNQRIPSDMVF 200

CAA GTT GGA GAC CTC ATC ATA GTG GAA AAG AAT CAA AGA ATT CCA TCG GAC ATG GTG TTT 600 

LRTSEKAGSCFIRTDQLDGE 220 

CTT AGG ACT TCA GAA AAA GCA GGT TCG TGT TTT ATT CGA ACT GAT CAA CTA GAT GGT GAA 660 

TDWKLKVAVSCTQQLPALGD 240 

ACT GAC TGG AAG CTG AAG GTG GCA GTG AGC TGC ACG CAA CAG CTG CCG GCT CTG GGG GAC 720 

LFSISAYVYAQKPQMDIHSF 260 

CTT TTT TCT ATC AGT GCT TAT GTT TAT GCT CAG AAA CCA CAA ATG GAC ATT CAC AGT TTC 780 

EGTFTREDSDPPIHESLSIE 280 

GAA GGC ACA TTT ACC AGG GAA GAC AGT GAC CCG CCC ATT CAT GAA AGT CTC AGC ATA GAA 840 

NTLWASTIVASGTVIGVVIY 300 

AAT ACA TTG TGG GCA AGC ACC ATT GTT GCA TCA GGT ACT GTA ATA GGT GTT GTC ATT TAT 900 

TGKETRSVMNTSNPKNKVGL 320 

ACC GGA AAA GAG ACT CGA AGT GTA ATG AAC ACA TCC AAT CCA AAA AAT AAG GTT GGT TTG 960 

LDLELNRLTKALFLALVALS 340 

TTG GAC CTT GAA CTC AAT CGG CTG ACG AAA GCG CTA TTT TTG GCT TTA GTT GCT CTT TCC 1020 

IVMVTLQGFVGPWYRNLFRF 360

ATT GTT ATG GTA ACC TTA CAA GGA TTT GTG GGT CCA TGG TAC CGC AAT CTT TTT CGG TTC 1080

LLLFSYIIPISLRVNLDMGK 380

CTT CTC CTC TTT TCT TAC ATC ATT CCC ATA AGT TTG CGT GTG AAC TTG GAC ATG GGC AAA 1140

AVYGWMMMKDENIPGTVVRT 400

GCG GTG TAT GGA TGG ATG ATG ATG AAA GAT GAG AAC ATC CCT GGC ACG GTC GTT CGG ACC 1200

STIPEELGRLVYLLTDKTGT 420

AGC ACT ATC CCA GAG GAA CTT GGG CGC CTG GTG TAT TTA TTG ACA GAC AAA ACA GGA ACC 1260

LTQNEMIFKRLHLGTVSYGA 440

CTC ACC CAG AAT GAA ATG ATA TTT AAG CGG CTG CAC CTG GGC ACC GTG TCC TAT GGC GCC 1320

DTMDEIQSHVRDSYSQMQSQ 460

GAC ACG ATG GAT GAG ATC CAG AGC CAT GTC AGG GAC TCC TAC TCA CAG ATG CAG TCT CAA 1380

AGGNNTGSTPLRKAQSSAPK 480

GCT GGT GGA AAC AAT ACT GGT TCA ACT CCA CTA AGA AAA GCC CAA TCT TCA GCT CCC AAA 1440

VRKSVSSRIHEAVKAIVLCH 500

GTT AGG AAA AGT GTC AGT AGT CGA ATC CAT GAA GCC GTG AAA GCC ATC GTG CTG TGT CAC 1500

NVTPVYESRAGVTEETEFAE 520

AAC GTG ACC CCC GTG TAT GAG TCT CGG GCC GGC GTT ACT GAG GAG ACT GAG TTC GCA GAG 1560

ADQDFSDENRTYQASSPDEV 540

GCT GAC CAA GAC TTC AGT GAT GAG AAT CGC ACC TAC CAG GCT TCC AGC CCG GAT GAG GTC 1620

ALVQWTESVGLTLVSRDLTS 560

GCT CTG GTG CAG TGG ACA GAG AGT GTG GGC CTC ACG CTG GTC AGC AGG GAC CTC ACC TCC 1680

MQLKTPSGQVLSFCILQLFP 580

ATG CAG CTG AAG ACC CCC AGT GGC CAG GTC CTC AGC TTC TGC ATT CTG CAG CTG TTT CCC 1740

FTSESKRMGVIVRDESTAEI 600

TTC ACC TCC GAG AGC AAG CGG ATG GGC GTC ATC GTC AGG GAT GAA TCC ACG GCA GAA ATC 1800

TFYMKGADVAMSPIVQYNDW 620

ACA TTC TAC ATG AAG GGC GCT GAC GTG GCC ATG TCT CCT ATC GTG CAG TAT AAT GAC TGG 1860

LEEECGNMAREGLRTLVVAK 640

CTG GAA GAG GAG TGC GGA AAC ATG GCT CGC GAA GGA CTG CGG ACC CTC GTG GTT GCA AAG 1920

KALTEEQYQDFESRYTQAKL 660

AAG GCG TTG ACA GAG GAG CAG TAC CAG GAC TTT GAG AGC CGA TAC ACT CAA GCC AAG CTG 1980

SMHDRSLKVAAVVESLEREM 680

AGC ATG CAC GAC AGG TCC CTC AAG GTG GCC GCG GTA GTC GAG AGC CTG GAG AGG GAG ATG 2040

ELLCLTGVEDQLQADVRPTL 700

GAA CTG CTG TGC CTC ACC GGC GTG GAG GAC CAG CTG CAG GCA GAC GTG CGG CCC ACG CTG 2100

EMLRNAGIKIWMLTGDKLET 720

GAG ATG CTG CGC AAC GCC GGG ATC AAG ATA TGG ATG CTA ACA GGC GAT AAA CTC GAG ACA 2160

ATCIAKSSHLVSRTQDIHIF 740

GCT ACC TGC ATT GCC AAA AGT TCA CAT CTC GTG TCT AGA ACA CAA GAT ATT CAT ATT TTC 2220 

RQVTSRGEAHLELNAFRRKH 760 

AGA CAG GTA ACC AGT CGG GGA GAG GCA CAT TTG GAG CTG AAT GCA TTT CGA AGG AAG CAT 2280 

DCALVISGDSLEVCLKYYEH 780 

GAT TGT GCA CTA GTC ATA TCT GGG GAC TCT CTG GAG GTT TGT CTA AAG TAC TAC GAG CAT 2340 

EFVELACQCPAVVCCRCSPT 800 

GAA TTT GTG GAG CTG GCC TGC CAG TGC CCT GCC GTG GTT TGC TGC CGC TGC TCA CCC ACC 2400 

QKARIVTLLQQHTGRRTCAI 820 

CAG AAG GCC CGC ATT GTG ACA CTG CTG CAG CAG CAC ACA GGG AGA CGC ACC TGC GCC ATC 2460 

GDGGNDVSMIQAADCGIGIE 840 

GGT GAT GGA GGA AAT GAT GTC AGC ATG ATT CAG GCA GCA GAC TGT GGG ATT GGG ATT GAG 2520 

GKEGKQASLAADFSITQFRH 860 

GGA AAG GAG GGT AAA CAG GCC TCG CTG GCG GCC GAC TTC TCC ATC ACG CAG TTC CGG CAC 2580 

IGRLLMVHGRNSYKRSAALG 880 

ATA GGC AGG CTG CTC ATG GTG CAC GGG CGG AAC AGC TAC AAG AGG TCG GCG GCA CTC GGC 2640

QFVMHRGLIISTMQAVFSSV 900

CAG TTC GTC ATG CAC AGG GGC CTT ATC ATC TCC ACC ATG CAG GCT GTG TTT TCC TCA GTC 2700 

FYFASVPLYQGFLMVGYATI 920 

TTC TAC TTC GCA TCC GTC CCT TTG TAT CAG GGC TTC CTC ATG GTG GGG TAT GCC ACC ATA 2760 

YTMFPVFSLVLDQDVKPEMA 940 

TAC ACC ATG TTC CCA GTG TTC TCC TTA GTG CTG GAC CAG GAC GTG AAG CCA GAG ATG GCG 2820 

MLYPELYKDLTKGRSLSFKT 960 

ATG CTC TAC CCG GAG CTG TAC AAG GAC CTC ACC AAG GGA AGA TCC TTG TCC TTC AAA ACC 2880 

FLIWVLISIYQGGILMYGAL 980

TTC CTC ATC TGG GTT TTA ATA AGT ATT TAC CAA GGC GGC ATC CTC ATG TAT GGG GCC CTG 2940 

VLFESEFVHVVAISFTALIL 1000 

GTG CTC TTC GAG TCT GAG TTC GTC CAC GTG GTG GCC ATC TCC TTC ACC GCA CTG ATC CTG 3000

TELLMVALTVRTWHWLMVVA 1020 

ACC GAG CTG CTG ATG GTG GCG CTG ACC GTC CGC ACG TGG CAC TGG CTG ATG GTG GTG GCC 3060 

EFLSLGCYVSSLAFLNEYFD 1040 

GAG TTC CTC AGC TTA GGC TGC TAC GTG TCC TCA CTC GCT TTT CTC AAT GAA TAT TTT GAT 3120 

VAFITTVTFLWKVSAITVVS 1060 

GTT GCC TTT ATC ACC ACC GTG ACC TTC CTG TGG AAA GTG TCG GCG ATC ACC GTG GTC AGC 3180 

CLPLYVLKYLRRKSSPPSYC 1080 

TGC CTC CCG CTG TAT GTC CTC AAG TAC CTG AGG CGC AAG TCT TCT CCT CCC AGC TAC TGC 3240 

KLAS* 1085

AAG CTG GCC TCC TAA 3255 

GGGGCTGTGCACCCCCAGCGGGCTGGCCCCAGCACCTTCTGCCCTTCCCAGCACCTTC 
GGTTTGCCATTGCTACCAAGCAAGCACCACAAGAAAGGGAGGGTACGCCAGGCGAGCCCAGGGCACAGATGCTGAGACA
GCCTCTCCTTCTCAGTGCAGGGACGTCACCCCTGCCAGGCAAGCCCAGGGCACAGATGC 

AGTGCGAGGCTTCACCCCTGCCAGGCAAGCCCAGGGCATAGATGCTGAGACAGCCTCTCCCTCTCAGTGCAGGGACGTC 
ACCCCTGCCAGGCAAGCCCAGGGCACAGAGGCCGGGACGGCCTCTCCCTCTCAGTGTGAGGCTTCACCCATGCTAGGCA 
AGCCCAGGGCACAGATGCCGGGATGGCCCCTCCCTCTCAGTGCGGGAACGTCACCCCTGCCAGGCAAGCCCAGGGCACA 
GATGCTGCGATGGCCTCTTCCTCTTAAGTGTGGGGCCTCACCCCT 

ATTTCCATATTGAAGCAGCTTGAGTTTCTACTGAAAATGAGCCCGAATTATTTCACTATTACTGT 

ACTCTGGCATTCTGAGAATTAGACTGAAAGTTTAATTTCTGCAGTTCCCTCATATTCAG 

ACACAAAGTCATTCCTACTCAAATGTAATAAAATTGAGGCTCCACGGAGAAAAAAAA 

Protein Family / Domain Matches, HMMer version 2



Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL) . 

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam

Sequence file: /prod/ddm/wspace/orf anal/oa-script.16315.seq

Query: 67084FL 

Scores for sequence family classification (score includes all domains):
Model        Description                             Score    E-value  N
Hydrolase    haloacid dehalogenase-like hydrolase     19.2     0.0051   1
E1-E2_ATPase E1-E2 ATPase                             15.8    0.00087   2

Parsed for domains : 

Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
E1-E2_ATPase   1/2     171   199 ..    42    70 ..     3.0      6.9
E1-E2_ATPase   2/2     277   305 ..   105   133 ..    13.0   0.0064
Hydrolase      1/1     410   843 ..     1   184 []    19.2   0.0051

Alignments of top-scoring domains: 

E1-E2_ATPase: domain 1 of 2, from 171 to 199: score 3.0, E = 6.9

*->keeeipaeeLvpGDiVevkpGdrVPADgr<-*

+ ++++++++++GD+++V+ r+P D++
67084FL 171 GKVQVKSSDIQVGDLIIVEKNQRIPSDMV 199

E1-E2_ATPase: domain 2 of 2, from 277 to 305: score 13.0, E = 0.0064

*->lergnmVfaGTlwsGsltgvVtatGddT<-*

l + n +++a+T+v sG+ +gvV+ tG++T
67084FL 277 LSIENTLWASTIVASGTVIGVVIYTGKET 305

Hydrolase: domain 1 of 1 , from 410 to 843: score 19.2, E = 0.0051 

* - >ikawFDkDGTLtdgkeppiaeaivealrelgl apleevekl 

+ + + Dk+GTLt+ + i + + +g ++ +++ ++ ++ 

67084FL 410 LVYLLTDKTGTLTQ - - NEM I FKRLHLGTVS YGAdtmde IQSHVRDS Y 454 

Igrgl .g erilleggltaell 

++++++++++++ +++r++++++ + ++ +++ ++ + + + + + 

67084FL 455 SQMQSqAggnntgstpLRKAQSSAPKVRKSvssriheavkaivlchnvtp 504 



+++ + + + + + + + + + ++++++ +++++ +++ + + 

67084FL 505 vyesragvteetef aeadqdf sdenrtyqasspdevalvqwtesvgltlv 554 



+ + + + + + +++ + + + ++++++ + + + + + + + + 

67084FL 555 srdl tsmqlktpsgqyl sf cilqlf pf tseskrmgvivrdestaeit f ym 604 



+ + + ++ + + +++ + + + + + + + + + + + + + +++ 
67084FL 605 kgadvamspivqyndwleeecgnmareglrtlwakkalteeqyqdf esr 654 

Id . evlgl ial . dklypgarealkaLk 

+ + + ++++ + ++ + + + e + 1 + 1 ++d+l ++r++l+ L+ 

67084FL 655 ytqaklsmhdrslkvaawesleREmELLCLTGVeDQLQADVRPTLEMLR 704 
erGikvailTngdr . naealle

+Gik+++lT++ ++a+ ++++++ +++++ + ++ +++++ + + + 
67084FL 705 NAGIKIWMLTGDKLeTATCIAKsshlvsrtqdihif rqvtsrgeahleln 754 

algla . If daivdsdevggvgpwvgKPkpe 

+++++ ++++ + +1 + ++++ +++++ +w+ + +p 
67084FL 755 af rrkhdcalvisgdslevCLK- YyEHEFVELACQCP AWCCRCSPT 800 

if HalerlgvkpeevgpkvlmvGDginDapalaaAGvgvamgngg< - * 
+ +++ 1+ + ++++GDg nD+ ++ aA++g+ + 
67084FL 801 QKARIVTLLQQHTGRR TCAIGDGGNDVSMIQAADCG IGIEGKE 843 

CLUSTAL W (1.74) multiple sequence alignment



Fbh67084FL
mAT2B



Fbh67084FL
mAT2B



Fbh67084FL
mAT2B



Fbh67084FL
mAT2B



MPLMMSEEGFENEESDYHTLPRARIMQRKRGLEWFVCDGWKFLCTSCCGWLINICRRKKE 

MPLMMSEEGFENDESDYHTLPRARITRRKRGLEWFVCGGWKFLCTSCCDWLIITVCQRK^ 
************ .************ : ********** m ********** **** : * . **** 


LKARTVWLGCPEKCEEKHPRNSIKNQKYNVFTFIPGVLYEQFKFFLNLYFLVISCSQFVP



LKARTVWLGC PEKCEEKH PRNS I KNQKYNVFTF I P feVLYEQFK|FFLNLYFLWSCS|Q FVP 
****************************************************:******* 



ALKIGYLYTYWAPLGFVLAVTMTREAIDEFRRFQRDKEVNSQLYSKLTVRGKVQVKSSDI
^E^GYLYTYWAPLGFVIAVTIARfeAIDEFRRFQRDKEMNSQLYSKLTVRGKVQVKSSDI 



********************* . . *************** ; ********************* 

QVGDLIIVEKNQRIPSDMVFLRTSEKAGSCFIRTDQLDGETDWKLKVAVSCTQQLPALGD
QVGDLI I VEKNQRI PSDMVFLRTSEKAGSCFIRgjD^pGETDWKLKVAVSCTQRLPALGD 
*****************************************************:****** 



Fbh67084FL
mAT2B



Fbh67084FL
mAT2B



Fbh67084FL
mAT2B



LFSISAYVYAQKPQMDIHSFEGTFTREDSDPPIHESLSIENTLWASTIVASGTVIGVVIY
LFSISAYVYAQKPQLDIHSFEGTFTREDSDPPIHESLSIENTLWASTIVASGTVIGWIY 
******************************************************* 


TGKETRSVMNTSNPKNKVGLLDLELNRL^ 

tgketrsvmntsnpnnk^glldlei^q ^ 

************** : *********** : **********. **:********. ****** **"*"* 

LLLFSYI I PISLR\{r^D^KAVYGW3^MMKDENI PGTWRTSTI PEELGRLVYLLIpKTGT ^ 
LLLFSYI I PI SLRVNIjPMQ kAAYGWMIMKDENI PGTWRTSTI PEELGRLVYLL TfDKTGT 
********************* **** .********************************* 



Fbh67084FL
mAT2B 



LTONEMIFKRLHLGTVSYGADTMDEIQSHVRDSYSQMQSQAGGNNTGSTPLRKAQSSAPK 
L^NE^fVFK3lLHLGTVSYGTDTMDEIQSHVLNSYLQVHSQPSGHNPSSAPLRRSQSSTPK 
******.************.********** .** *..***.* *.***..***.★* 



Fbh67084FL 
mAT2B 



VRKSVSSRIHEAVKAIVLCHNVTPVYESRAGVTEETEFAEADQDFSDENRTYQASSPDEV 
VKKSVSSRIHEAVKAIALCHNVTPVYEARAGITGETEFAEADQDFSDENRTYQASSPDEV 
*.************** **************** ************************** 



Fbh67084FL 
mAT2B 



ALVQWTESVGLTLVSRDLTSMQLKTPSGQVLSFCILQLFPFTSESKRMGVIVRDESTAEI 
ALVRWTESVGLTLVSRDLASMQLKTPSGQVLTYCILQMFPFTSESKRMGIIVRDESTAEI 
*** . ************** : ************ ; .**** : *********** ; ********** 



Fbh67084FL 
mAT2B 



TFYMKGADVAMSPIVQYNDWLEEECGNMAREG 

TFYMKGADVAMSTIVQYNDWLEEECGNMAREGLRTLVVAKRTLTEEQYQDFESRYSQAKL 

************ ***************************; : ************* : **** 



Fbh67084FL 
mAT2B 



SMHDRSLKVAAVVESLEREMELLCLTGVEDQLQADVRPTLEMLRNAGIKIWMLTGDKLET 
SIHDRALKVAAVVESLEREMELLCLTGVEDQLQADWPTLEM^ 

* .★** .****************************************************** 



Fbh67084FL 
mAT2B 



Fbh67084FL 
mAT2B



Fbh67084FL 



ATCIAKSSHLVSRTQDIHIFRQVTSRGEAHLELNAFRRKHDCALVISGDSLEVCLKYYEH
ATCIAKSSHLVSRTQDIHVFRPVTSRGEAHLELNAFRRKHDCALVISGDSLEV 
****************** .** ************************************** 



E FVELACQC P AWCCRCS P TQ KAR I VTLLQQHTGRRT GAjl GDGGND V£ MI 2AADCGIGIE 
ELVEIACQCPAVVCCRCSPTXKAHIVTLLRQHTRKRTCAjlp 

*.****************** ** ; ***** : *** .************************* 


GKEGKOASLAADFSITQFRHIGRLLMVHGRNSYKRSAAI^ 
mAT2B



Fbh67084FL 
mAT2B 



Fbh67084FL 



Fbh67084FL 
mAT2B 



GKEGKQASLAADFS I TQFRH I GRLLMVHGRNS YKRSA ALGQFVMHRGL 1 1 STMOAVFSSfr/ 
************************************************************ 



£X^SVPLYC^LMVGYATIYTMFPVFSLVipQDVKPEMAMLyPELYKDLTKGRSLSFKT 
FYFASVPLYO qFLMVGYATIYTMFPVFSLvE DODVKPEMAILYPELYKDLTKGRST..qFKT 
****************************************. ******************* 


F LI WVL I S I Y QGG I LM YG A LVL F^ B S E FyHft/V A I S FTAL I LTELLMV, 

^fvhvv; 



cmvK 

********************.*** ******************* ********* **^** *T 



FLIWVLI S I YOGGI LMYGAli LLFED ^FVmA/AI SFTALI LTELLXVALTO R* 




EFLSLGCYVSSLAFLjNEYF DVAglTTVTFLWKVSAITVVSCLPLYVljKY 

/AFfe] 



EFLSLGCYVAS3^FL^ YFGIGRVSFGAFLDVAF tlTTVTFLWKVSAITWSCLPLWI| KY 
*********.********* ****************************** 



Fbh67084FL 
mAT2B 



LRRKS S P P SYCKLAS 

LKRKLSPPSYSKLSS 
*.** ***** **.* 
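
The bottom line of each CLUSTAL W block marks fully conserved columns with `*`. As a rough cross-check of those marks for a pairwise block, the following Python sketch recomputes only the identity positions. It is a simplified illustration: it does not reproduce the `:` and `.` marks CLUSTAL W uses for conservative substitutions, and the function names `identity_line` and `percent_identity` are hypothetical.

```python
def identity_line(seq_a: str, seq_b: str) -> str:
    """Return '*' under identical, non-gap columns and ' ' elsewhere."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if a == b and a != "-" else " " for a, b in zip(seq_a, seq_b)
    )


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    marks = identity_line(seq_a, seq_b)
    return 100.0 * marks.count("*") / len(marks)


# Final block of the alignment above, with OCR spacing removed.
fbh = "LRRKSSPPSYCKLAS"   # Fbh67084FL
mat = "LKRKLSPPSYSKLSS"   # mAT2B
print(fbh)
print(mat)
print(identity_line(fbh, mat))
print(round(percent_identity(fbh, mat), 1))
```

Every `*` produced this way should coincide with a `*` in the printed consensus line; the columns the printed line marks with `.` are substitutions that this sketch leaves blank.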

Input file Fbh67084alt; Output File Fbh67084alt.tra
Sequence length 4231 

GGAGTCGACCCACGCGTCCGCATTGAGACAATGCCTCCACAAATACTTGATGCAAAATTCAGT^ 



AATCACCATTATAGTTTCTGACAAATTGTTCTCAAAAAGGTACCAGCTGGAGGATGAGTCTGCGCATTTGGATGAA 

MPLMMSEEGFENEESDYHTL 20 

ATG CCA CTA ATG ATG TCT GAA GAA GGC TTT GAG AAT GAG GAA AGT GAT TAC CAC ACC TTA 60 

PRARIMQRKRGLEWFVCDGW 40 

CCA CGA GCC AGG ATA ATG CAA AGG AAA AGA GGA CTG GAG TGG TTT GTC TGT GAT GGC TGG 120

KFLCTSCCGWLINICRRKKE 60 

AAG TTC CTC TGT ACC AGT TGC TGT GGT TGG CTG ATA AAT ATT TGT CGA AGA AAG AAA GAG 180 

LKARTVWLGCPEKCEEKHPR 80 

CTG AAA GCT CGC ACA GTA TGG CTT GGA TGT CCT GAA AAG TGT GAA GAA AAA CAT CCC AGG 240

NSIKNQKYNVFTFIPGVLYE 100

AAT TCT ATA AAA AAT CAA AAA TAC AAT GTG TTT ACC TTT ATA CCT GGG GTT TTG TAT GAA 300

QFKFFLNLYFLVISCSQFVP 120

CAA TTC AAG TTT TTC TTG AAT CTC TAT TTT CTA GTG ATA TCC TGC TCA CAG TTT GTA CCA 360

ALKIGYLYTYWAPLGFVLAV 140

GCA TTG AAA ATA GGC TAT CTC TAC ACC TAC TGG GCT CCT CTG GGA TTT GTC TTG GCT GTT 420

TMTREAIDEFRRFQRDKEVN 160

ACT ATG ACA CGG GAA GCA ATT GAT GAA TTT CGG CGT TTT CAG CGT GAC AAG GAA GTG AAT 480

SQLYSKLTVRGKVQVKSSDI 180 

TCA CAA CTA TAT AGC AAG CTT ACA GTA AGA GGT AAA GTG CAA GTT AAG AGT TCA GAC ATA 540 

QVGDLIIVEKNQRIPSDMVF 200

CAA GTT GGA GAC CTC ATC ATA GTG GAA AAG AAT CAA AGA ATT CCA TCG GAC ATG GTG TTT 600 

LRTSEKAGSCFIRTDQLDGE 220 

CTT AGG ACT TCA GAA AAA GCA GGT TCG TGT TTT ATT CGA ACT GAT CAA CTA GAT GGT GAA 660 

TDWKLKVAVSCTQQLPALGD 240 

ACT GAC TGG AAG CTG AAG GTG GCA GTG AGC TGC ACG CAA CAG CTG CCG GCT CTG GGG GAC 720

LFSISAYVYAQKPQMDIHSF 260

CTT TTT TCT ATC AGT GCT TAT GTT TAT GCT CAG AAA CCA CAA ATG GAC ATT CAC AGT TTC 780

EGTFTREDSDPPIHESLSIE 280

GAA GGC ACA TTT ACC AGG GAA GAC AGT GAC CCG CCC ATT CAT GAA AGT CTC AGC ATA GAA 840 

NTLWASTIVASGTVIGVVIY 300 

AAT ACA TTG TGG GCA AGC ACC ATT GTT GCA TCA GGT ACT GTA ATA GGT GTT GTC ATT TAT 900 

TGKETRSVMNTSNPKNKVGL 320 

ACC GGA AAA GAG ACT CGA AGT GTA ATG AAC ACA TCC AAT CCA AAA AAT AAG GTT GGT TTG 960 

LDLELNRLTKALFLALVALS 340 

TTG GAC CTT GAA CTC AAT CGG CTG ACG AAA GCG CTA TTT TTG GCT TTA GTT GCT CTT TCC 1020 

IVMVTLQGFVGPWYRNLFRF 360

ATT GTT ATG GTA ACC TTA CAA GGA TTT GTG GGT CCA TGG TAC CGC AAT CTT TTT CGG TTC 1080 

LLLFSYIIPISLRVNLDMGK 380

CTT CTC CTC TTT TCT TAC ATC ATT CCC ATA AGT TTG CGT GTG AAC TTG GAC ATG GGC AAA 1140

AVYGWMMMKDENIPGTVVRT 400

GCG GTG TAT GGA TGG ATG ATG ATG AAA GAT GAG AAC ATC CCT GGC ACG GTC GTT CGG ACC 1200

STIPEELGRLVYLLTDKTGT 420

AGC ACT ATC CCA GAG GAA CTT GGG CGC CTG GTG TAT TTA TTG ACA GAC AAA ACA GGA ACC 1260

LTQNEMIFKRLHLGTVSYGA 440 

CTC ACC CAG AAT GAA ATG ATA TTT AAG CGG CTG CAC CTG GGC ACC GTG TCC TAT GGC GCC 1320 

DTMDEIQSHVRDSYSQMQSQ 460 

GAC ACG ATG GAT GAG ATC CAG AGC CAT GTC AGG GAC TCC TAC TCA CAG ATG CAG TCT CAA 1380

AGGNNTGSTPLRKAQSSAPK 480 

GCT GGT GGA AAC AAT ACT GGT TCA ACT CCA CTA AGA AAA GCC CAA TCT TCA GCT CCC AAA 1440 

VRKSVSSRIHEAVKAIVLCH 500

GTT AGG AAA AGT GTC AGT AGT CGA ATC CAT GAA GCC GTG AAA GCC ATC GTG CTG TGT CAC 1500 

NVTPVYESRAGVTEETEFAE 520 

AAC GTG ACC CCC GTG TAT GAG TCT CGG GCC GGC GTT ACT GAG GAG ACT GAG TTC GCA GAG 1560 

ADQDFSDENRTYQASSPDEV 540 

GCT GAC CAA GAC TTC AGT GAT GAG AAT CGC ACC TAC CAG GCT TCC AGC CCG GAT GAG GTC 1620

ALVQWTESVGLTLVSRDLTS 560

GCT CTG GTG CAG TGG ACA GAG AGT GTG GGC CTC ACG CTG GTC AGC AGG GAC CTC ACC TCC 1680

MQLKTPSGQVLSFCILQLFP 580

ATG CAG CTG AAG ACC CCC AGT GGC CAG GTC CTC AGC TTC TGC ATT CTG CAG CTG TTT CCC 1740

FTSESKRMGVIVRDESTAEI 600

TTC ACC TCC GAG AGC AAG CGG ATG GGC GTC ATC GTC AGG GAT GAA TCC ACG GCA GAA ATC 1800

TFYMKGADVAMSPIVQYNDW 620

ACA TTC TAC ATG AAG GGC GCT GAC GTG GCC ATG TCT CCT ATC GTG CAG TAT AAT GAC TGG 1860

LEEECGNMAREGLRTLVVAK 640 

CTG GAA GAG GAG TGC GGA AAC ATG GCT CGC GAA GGA CTG CGG ACC CTC GTG GTT GCA AAG 1920 

KALTEEQYQDFESRYTQAKL 660 

AAG GCG TTG ACA GAG GAG CAG TAC CAG GAC TTT GAG AGC CGA TAC ACT CAA GCC AAG CTG 1980

SMHDRSLKVAAVVESLEREM 680 

AGC ATG CAC GAC AGG TCC CTC AAG GTG GCC GCG GTA GTC GAG AGC CTG GAG AGG GAG ATG 2040 

ELLCLTGVEDQLQADVRPTL 700 

GAA CTG CTG TGC CTC ACC GGC GTG GAG GAC CAG CTG CAG GCA GAC GTG CGG CCC ACG CTG 2100 

EMLRNAGIKIWMLTGDKLET 720 

GAG ATG CTG CGC AAC GCC GGG ATC AAG ATA TGG ATG CTA ACA GGC GAT AAA CTC GAG ACA 2160 

ATCIAKSSHLVSRTQDIHIF 740

GCT ACC TGC ATT GCC AAA AGT TCA CAT CTC GTG TCT AGA ACA CAA GAT ATT CAT ATT TTC 2220 

RQVTSRGEAHLELNAFRRKH 760

AGA CAG GTA ACC AGT CGG GGA GAG GCA CAT TTG GAG CTG AAT GCA TTT CGA AGG AAG CAT 2280 

DCALVISGDSLEVCLKYYEH 780

GAT TGT GCA CTA GTC ATA TCT GGG GAC TCT CTG GAG GTT TGT CTA AAG TAC TAC GAG CAT 2340 

EFVELACQCPAVVCCRCSPT 800 

GAA TTT GTG GAG CTG GCC TGC CAG TGC CCT GCC GTG GTT TGC TGC CGC TGC TCA CCC ACC 2400 

QKARIVTLLQQHTGRRTCAI 820

CAG AAG GCC CGC ATT GTG ACA CTG CTG CAG CAG CAC ACA GGG AGA CGC ACC TGC GCC ATC 2460

GDGGNDVSMIQAADCGIGIE 840

GGT GAT GGA GGA AAT GAT GTC AGC ATG ATT CAG GCA GCA GAC TGT GGG ATT GGG ATT GAG 2520

GKEGKQASLAADFSITQFRH 860

GGA AAG GAG GGT AAA CAG GCC TCG CTG GCG GCC GAC TTC TCC ATC ACG CAG TTC CGG CAC 2580

IGRLLMVHGRNSYKRSAALG 880 

ATA GGC AGG CTG CTC ATG GTG CAC GGG CGG AAC AGC TAC AAG AGG TCG GCG GCA CTC GGC 2640

QFVMHRGLIISTMQAVFSSV 900

CAG TTC GTC ATG CAC AGG GGC CTT ATC ATC TCC ACC ATG CAG GCT GTG TTT TCC TCA GTC 2700

FYFASVPLYQGFLMVGYATI 920 

TTC TAC TTC GCA TCC GTC CCT TTG TAT CAG GGC TTC CTC ATG GTG GGG TAT GCC ACC ATA 2760

YTMFPVFSLVLDQDVKPEMA 940

TAC ACC ATG TTC CCA GTG TTC TCC TTA GTG CTG GAC CAG GAC GTG AAG CCA GAG ATG GCG 2820

MLYPELYKDLTKGRSLSFKT 960 

ATG CTC TAC CCG GAG CTG TAC AAG GAC CTC ACC AAG GGA AGA TCC TTG TCC TTC AAA ACC 2880 

FLIWVLISIYQGGILMYGAL 980

TTC CTC ATC TGG GTT TTA ATA AGT ATT TAC CAA GGC GGC ATC CTC ATG TAT GGG GCC CTG 2940 

VLFESEFVHVVAISFTALIL 1000 

GTG CTC TTC GAG TCT GAG TTC GTC CAC GTG GTG GCC ATC TCC TTC ACC GCA CTG ATC CTG 3000

TELLMVALTVRTWHWLMVVA 1020 

ACC GAG CTG CTG ATG GTG GCG CTG ACC GTC CGC ACG TGG CAC TGG CTG ATG GTG GTG GCC 3060 

EFLSLGCYVSSLAFLNEYFG 1040 

GAG TTC CTC AGC TTA GGC TGC TAC GTG TCC TCA CTC GCT TTT CTC AAT GAA TAT TTT GGT 3120 

IGRVSFGAFLDVAFITTVTF 1060 

ATA GGC AGA GTG TCT TTT GGA GCT TTC TTA GAT GTT GCC TTT ATC ACC ACC GTG ACC TTC 3180 

LWKVSAITVVSCLPLYVLKY 1080 

CTG TGG AAA GTG TCG GCG ATC ACC GTG GTC AGC TGC CTC CCG CTG TAT GTC CTC AAG TAC 3240 

LRRKSSPPSYCKLAS* 1096 

CTG AGG CGC AAG TCT TCT CCT CCC AGC TAC TGC AAG CTG GCC TCC TAA 3288 

GGGGCTGTX3CACCCCCAGCGGGCTGGCCCCAGCACCTTCTGCCCTTCCCAGCACCTTGTC 



G D G 
GGT GAT GGA 

G K E 
GGA AAG GAG 

GGTTTGCCATTGCTACCAAGCAAGCAC^^
GCCTCTCCTTCTCAGTGCAGGGACGTCACCCCTGCCAGGC^ 

AGTGCGAGGCTTCACCCCTGCCAGGCAAGCCCAGGGCATAGATGCTGAGACAGCCTCTCCCTCTCAGTGCAGGGACGTC 

ACCCCTGCCAGGCAAGCCCAGGGCACAGAGGCCGGGACGGCCTCTCCCTCTCAGTGTGAGGCTTCACCCATGCTAGGCA 

AGCCCAGGGCACAGATGCCGGGATGGCCCCTCCCTCTCAGTGCGGGAACGTCACCCCTGCCAGGCAAGCCCAGGGCACA 

GATGCTGCGATGGCCTCTTCCTCTTAAGTGTGGGGCCTCACCCCTC 

ATTTCCATATTGAAGCAGCTTGAGTTTCTACTGAAAATGAGCCCGAATT^ 

ACTCTGGCATTCTGAGAATTAGACTGAAAGTTTAATTTCTGCAGTTCCCTCATAT^ 

ACACAAAGTCATTCCTACTCAAATGTAATAAAATTGAGGCTCCACGGAGAA^ 
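
The pairing of codon rows and residue rows in the listing above can be spot-checked mechanically. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the function name `translate` is hypothetical and is not part of the original analysis. Whitespace is stripped before reading codons, so a codon that the scan split with a stray space is still read correctly.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons: str) -> str:
    """Translate a codon row into one-letter residues ('*' marks a stop).

    All whitespace is removed first; any trailing partial codon is ignored,
    and codons containing unreadable characters are reported as 'X'.
    """
    seq = "".join(codons.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    # Final coding row of the listing, ending in the TAA stop codon.
    last_row = "CTG AGG CGC AAG TCT TCT CCT CCC AGC TAC TGC AAG CTG GCC TCC TAA"
    print(translate(last_row))
```

Running the same function over each codon row and comparing against the residue row printed above it is one way to locate scanning errors in either row.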
Protein Family / Domain Matches, HMMer version 2

Searching for complete domains in PFAM

hmmpfam - search a single seq against HMM database

HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL)

HMM file: /prod/ddm/seqanal/PFAM/pfam6.4/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.17118-seq

Query: 67084alt

Scores for sequence family classification (score includes all domains):

Model         Description                            Score   E-value   N
Hydrolase     haloacid dehalogenase-like hydrolase    19.2    0.0051   1
E1-E2_ATPase  E1-E2 ATPase                            15.8   0.00087   2



Parsed for domains:

Model         Domain  seq-f  seq-t  hmm-f  hmm-t      score  E-value
E1-E2_ATPase  1/2       171    199     42     70 ..     3.0      6.9
E1-E2_ATPase  2/2       277    305    105    133 ..    13.0   0.0064
Hydrolase     1/1       410    843      1    184 []    19.2   0.0051
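
The per-domain rows above can be read programmatically to compare the hits. The following is a minimal sketch under stated assumptions: the names `DomainHit` and `parse_domain_line` are hypothetical, the rows are assumed to be whitespace-separated in the order model, domain, seq-f, seq-t, hmm-f, hmm-t, score, E-value, and the E-value cutoff of 0.01 is an illustrative choice rather than a threshold stated in the report.

```python
from typing import NamedTuple

# Boundary flags that hmmpfam prints after coordinate pairs; skipped on parse.
FLAGS = {"..", "[]", "[.", ".]"}


class DomainHit(NamedTuple):
    model: str
    domain: str
    seq_from: int
    seq_to: int
    hmm_from: int
    hmm_to: int
    score: float
    evalue: float

    @property
    def seq_length(self) -> int:
        """Number of query residues covered by the hit (inclusive range)."""
        return self.seq_to - self.seq_from + 1


def parse_domain_line(line: str) -> DomainHit:
    """Parse one row of the 'Parsed for domains' table."""
    fields = [f for f in line.split() if f not in FLAGS]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    model, domain, seq_f, seq_t, hmm_f, hmm_t, score, evalue = fields
    return DomainHit(model, domain, int(seq_f), int(seq_t),
                     int(hmm_f), int(hmm_t), float(score), float(evalue))


# Rows transcribed from the table above.
ROWS = [
    "E1-E2_ATPase  1/2  171  199  42   70 ..   3.0  6.9",
    "E1-E2_ATPase  2/2  277  305  105 133 ..  13.0  0.0064",
    "Hydrolase     1/1  410  843  1   184 []  19.2  0.0051",
]

if __name__ == "__main__":
    cutoff = 0.01  # assumed cutoff, for illustration only
    for hit in map(parse_domain_line, ROWS):
        flag = "below cutoff" if hit.evalue <= cutoff else "above cutoff"
        print(hit.model, hit.domain, hit.seq_length, hit.evalue, flag)
```

Keeping the coordinates as integers makes it straightforward to map each hit back onto the residue numbering used in the sequence listing.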



Alignments of top-scoring domains: 

E1-E2_ATPase: domain 1 of 2, from 171 to 199: score 3.0, E = 6.9

*->keeeipaeeLvpGDiVevkpGdrVPADgr<-*

+ +-t-++++++ + +GD+ ++v+ r+P D+ +
67084alt 171 GKVQVKSSDIQVGDLIIVEKNQRIPSDMV 199

E1-E2_ATPase: domain 2 of 2, from 277 to 305: score 13.0, E = 0.0064

*->lergnmVfaGTlvvsGsltgvVtatGddT<-*

l + n+++a+T+v sG+ +gvV+ tG++T
67084alt 277 LSIENTLWASTIVASGTVIGVVIYTGKET 305

Hydrolase: domain 1 of 1, from 410 to 843: score 19.2, E = 0.0051

* - >ikawFDkDGTLtdgkeppiaeaivealrelgl apleevekl 

+ ++ Dk+GTLt+ + i + + +g ++ +++ + + ++ 

67084alt 410 LVYLLTDKTGTLTQ--NEMIFKRLHLGTVSYGAdtmdeIQSHVRDSY 454



lgrgl .g erilleggltaell

+++++++++++++++r++++++ + ++ +++ ++ + ++ ++

67084alt 455 SQMQSqAggnntgstpLRKAQSSAPKVRKSvssriheavkaivlchnvtp 504



+++ + +++++ + + + ++++++ +++++ +++ + + 

67084alt 505 vyesragvteetefaeadqdfsdenrtyqasspdevalvqwtesvgltlv 554



+++ ++ +++++ + + ++++++ + +++++ + + 

67084alt 555 srdltsmqlktpsgqvlsfcilqlfpftseskrmgvivrdestaeitfym 604



++ + ++ ++ +++ ++ +++ ++ ++ + + +- + +++ 
67084alt 605 kgadvamspivqyndwleeecgnmareglrtlvvakkalteeqyqdfesr 654



+ + ++++ + 



Id . evlgl ial . dklypgarealkaLk 

++ +++ e+1+1 ++d+l ++r++l+ L+ 



67084alt 655 ytqaklsmhdrslkvaavvesleREmELLCLTGVeDQLQADVRPTLEMLR 704



FIGURE 38A 



e rG ikva i 1 Tngdr . naea 1 1 e 

+Gik+++1T++ ++a+ ++++++ ++ + + + + ++ +++++ + + + 
67084alt 705 NAGIKIWMLTGDKLeTATCIAKsshlvsrtqdihifrqvtsrgeahleln 754

algla. If daivdsdevggvgpvwgKPkpe 

+++ + + ++++ + +1 + +++ + +++++ +w+ + +p 

67084alt 755 afrrkhdcalvisgdslevCLK-YyEHEFVELACQCP---AVVCCRCSPT 800

i f 1 1 al er lgvkpeevgpkvlmvGDginDapalaaAGvgvamgngg< - * 
+ +++ 1+ + ++++GDg nD+ ++ aA++g+ + 
67084alt 801 QKARIVTLLQQHTGRR TCAIGDGGNDVSMIQAADCGIGIEGKE 843



FIGURE 38B 



CLUSTAL W (1.74) multiple sequence alignment 



Fbh67084alt     MPLMMSEEGFENEESDYHTLPRARIMQRKRGLEWFVCDGWKFLCTSCCGWLINICRRKKE
mAT2B           MPLMMSEEGFENDESDYHTLPRARITRRKRGLEWFVCGGWKFLCTSCCDWLINVCQRKKE
                ************.************ : **********. ************** .*.****

Fbh67084alt     LKARTVWLGCPEKCEEKHPRNSIKNQKYNVFTFIPGVLYEQFKFFLNLYFLVISCSQFVP
mAT2B           LKARTVWLGCPEKCEEKHPRNSIKNQKYNVFTFIPGVLYEQFKFFLNLYFLVVSCSQFVP
                ****************************************************:*******

Fbh67084alt     ALKft GYLYTYWAPLGFVLAVTMTREAIDEFRRFQRDKEVNSQLYSKLTVRGKVQVKSSDI
mAT2B           ~5QCIGYLYTYWAPLGFVLAVTIAREAIDEFRRFQRDKEMNSQLYSKLTVRGKVQVKSSDI
                * ************* A ?7*TTT77** * * ********** *****

Fbh67084alt     QVGDLIIVEKNQRIPSDMVFLRTSEKAGSCFI] )Ql DGETDWKLKVAVSCTQQLPALGD
mAT2B           QVGDLIIVEKNQRIPSDMVFLRTSEKAGSCFIR\ )QL DGETDWKLKVAVSCTQRLPALGD
                ********************************* *^******************* :******



Fbh67084alt     LFSISAYVYAQKPQMDIHSFEGTFTREDSDPPIHESLSIENTLWASTIVASGTVIGVVIY
mAT2B           LFSISAYVYAQKPQLDIHSFEGTFTREDSDPPIHESLSIENTLWASTIVASGTVIGVVIY
                **************. *********************************************

                **************:***********:**********.**:******** • * * * * *>h££*r*fc*

Fbh67084alt     LLLFSYIIPISLRVNLDMGKAVYGWMMMKDENIPGTVVRTSTIPEELGRLVYLLTDKTGT
mAT2B           LLLFSYIIPISLRVNLDMGKAAYGWMIMKDENIPGTVVRTSTIPEELGRLVYLLTDKTGT
                «**** «» * • * - ********* ***.****:*********************************

Fbh67084alt     LTQNEMIFKRLHLGTVSYGADTMDEIQSHVRDSYSQMQSQAGGNNTGSTPLRKAQSSAPK
mAT2B           LTQNEMVFKRLHLGTVSYGTDTMDEIQSHVLiNSYLQVHSQPSGHNPSSAPLRRSQSSTPK
                W**** .************ : ********** :** * :: **..*:*..*:***::***:**

Fbh67084alt     VRKSVSSRIHEAVKAIVLCHNVTPVYESRAGVTEETEFAEADQDFSDENRTYQASSPDEV
mAT2B           VKKSVSSRIHEAVKAIALCHNVTPVYEARAGITGETEFAEADQDFSDENRTYQASSPDEV
                *.**************.**********:***:* **************************

Fbh67084alt     ALVQWTESVGLTLVSRDLTSMQLKTPSGQVLSFCILQLFPFTSESKRMGVIVRDESTAEI
mAT2B           ALVRWTESVGLTLVSRDLASMQLKTPSGQVLTYCILQMFPFTSESKRMGIIVRDESTAEI
                ***.**************:************::****:***********:**********

Fbh67084alt     TFYMKGADVAMSPIVQYNDWLEEECGNMAREGLRTLVVAKKALTEEQYQDFESRYTQAKL
mAT2B           TFYMKGADVAMSTIVQYNDWLEEECGNMAREGLRTLVVAKRTLTEEQYQDFESRYSQAKL
                ************.***************************::*************:****

Fbh67084alt     SMHDRSLKVAAVVESLEREMELLCLTGVEDQLQADVRPTLEMLRNAGIKIWMLTGDKLET
mAT2B           SIHDRALKVAAVVESLEREMELLCLTGVEDQLQADVRPTLEMLRNAGIKIWMLTGDKLET
                *.***. ******************************************************



Fbh67084alt     ATCIAKSSHLVSRTQDIHIFRQVTSRGEAHLELNAFRRKHDCALVISGDSLEVCLKYYEH
mAT2B           ATCIAKSSHLVSRTQDIHVFRPVTSRGEAHLELNAFRRKHDCALVISGDSLEVCLRYYEH
                ******************:** ***************^^T4^crV **■■****

Fbh67084alt     EFVELACQCPAVVCCRCSPTQKARIVTLLQQHTGRRTCAIGDGGNDVSMIQAADCGIGIE
mAT2B           elveiiacqcpawccrc s ptxkah i vtllrqhtrkrtcahlgdggndvsm ] qaadcg i g i e
                *.****************** **•*****:*** : ****V********W**********

Fbh67084alt     GKEGKQASLAADFSITQFRHIGRLLMVHGRNSYKRSAALGQFVMHRGLIISTMQAVFSSV
mAT2B           GKEGKQASLAADFSITQFRHIGRLIJWHGRN^
                ************************************* y> »

Fbh67084alt     FYFASVPLYQGFLMVGYATIYTMFPVFSLVLDQDVKPEMAMLYPELYKDLTKGRSLSFKT
mAT2B           FYFASVPLYO a^LMVGYATIYTMFPVFSL^f bQDVKPEMAILYPELYKDLTKGRSL

Fbh67084alt     FLIWVLISIYQGGILMYGALVLFESEFVHVVAISFTALILTELLMVALTVRTWHWLMVVA
mAT2B           FLIWVLISI

Fbh67084alt     EFLSLGCYVSSLAFLNEYFGIGRVSFGAFLDVAFITTVTFLWKVSAITVVSCLPLYVLKY
mAT2B           EFLSLGCYVASLAF
                ********* . **************************************************

Fbh67084alt     LRRKSSPPSYCKLAS
mAT2B           LKRKLSPPSYSKLSS
                * . * * *******.*
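
The conservation rows in the alignment above mark identical columns with `*`. The same count can be reproduced directly from two aligned rows. The following is a minimal sketch; the function name `count_identities` is hypothetical, and skipping any column that contains a gap character (`-`) is an assumed convention, not one stated in the alignment output.

```python
from typing import Tuple


def count_identities(row_a: str, row_b: str) -> Tuple[int, int]:
    """Return (identical columns, compared columns) for two aligned rows.

    Both rows must have the same length. Columns in which either row
    has a gap character '-' are left out of the comparison.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    identical = sum(1 for a, b in compared if a == b)
    return identical, len(compared)


if __name__ == "__main__":
    # Final alignment block, transcribed from the two rows above.
    fbh = "LRRKSSPPSYCKLAS"
    mat2b = "LKRKLSPPSYSKLSS"
    identical, compared = count_identities(fbh, mat2b)
    print(f"{identical}/{compared} identical "
          f"({100.0 * identical / compared:.1f}%)")
```

Summing the two counts over every block gives an overall percent identity for the pair, provided each block is first checked for scanning errors.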